Learning Reward Functions by Integrating Human Demonstrations and Preferences.

Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

Published in: CoRR (2019)

Keyphrases

active learning
learning algorithm
inverse reinforcement learning
reinforcement learning
supervised learning
prior knowledge
Markov chain
sufficient conditions
preference elicitation