Inverse Reinforcement Learning through Policy Gradient Minimization.
Matteo Pirotta, Marcello Restelli
Published in: AAAI (2016)
Keyphrases
- inverse reinforcement learning
- policy gradient
- reward function
- reinforcement learning algorithms
- reinforcement learning
- function approximation
- preference elicitation
- temporal difference
- gradient method
- optimal control
- state action
- average reward
- state space
- objective function
- evaluation function
- state variables
- learning tasks
- markov decision processes
- partially observable
- single agent
- optimal policy