Inverse Reinforcement Learning through Policy Gradient Minimization.

Matteo Pirotta, Marcello Restelli

Published in: AAAI (2016)

Keyphrases

inverse reinforcement learning
policy gradient
reward function
reinforcement learning algorithms
reinforcement learning
function approximation
preference elicitation
temporal difference
gradient method
optimal control
state action
average reward
state space
objective function
evaluation function
state variables
learning tasks
Markov decision processes
partially observable
single agent
optimal policy