Compatible Reward Inverse Reinforcement Learning.

Alberto Maria Metelli, Matteo Pirotta, Marcello Restelli

Published in: NIPS (2017)

Keyphrases

inverse reinforcement learning
partially observable environments
Bayesian nonparametric
reward function
preference elicitation
temporal difference
reinforcement learning
Monte Carlo
resource allocation
Markov decision processes
state space