Learning reward functions from diverse sources of human feedback: Optimally integrating demonstrations and preferences.

Erdem Biyik, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh
Published in: Int. J. Robotics Res. (2022)
Keyphrases
  • reinforcement learning
  • learning algorithm
  • motor skills
  • inverse reinforcement learning
  • state space
  • probability distribution
  • optimal policy
  • markov decision process
  • preference elicitation