Inverse Reinforcement Learning with Locally Consistent Reward Functions.

Quoc Phong Nguyen, Kian Hsiang Low, Patrick Jaillet

Published in: NIPS (2015)

Keyphrases

inverse reinforcement learning
reward function
preference elicitation
Markov decision processes
reinforcement learning
state space
optimal policy
reinforcement learning algorithms
partially observable
simple examples
multiple agents
transition probabilities
Markov decision process
dynamic systems
function approximation
infinite horizon