Active Learning for Reward Estimation in Inverse Reinforcement Learning.

Manuel Lopes, Francisco S. Melo, Luis Montesano

Published in: ECML/PKDD (2) (2009)

Keyphrases

inverse reinforcement learning
active learning
partially observable environments
Bayesian nonparametric
reward function
preference elicitation
random sampling
temporal difference
artificial intelligence
learning algorithm
multi-attribute
special case
partially observable
reinforcement learning
machine learning
semi-supervised
reinforcement learning algorithms
training data