Learning reward functions from diverse sources of human feedback: Optimally integrating demonstrations and preferences.
Erdem Biyik
Dylan P. Losey
Malayandi Palan
Nicholas C. Landolfi
Gleb Shevchuk
Dorsa Sadigh
Published in:
Int. J. Robotics Res. (2022)
Keyphrases
reinforcement learning
learning algorithm
motor skills
inverse reinforcement learning
state space
probability distribution
optimal policy
markov decision process
preference elicitation