The Virtues of Pessimism in Inverse Reinforcement Learning.

David Wu, Gokul Swamy, J. Andrew Bagnell, Zhiwei Steven Wu, Sanjiban Choudhury
Published in: CoRR (2024)
Keyphrases
  • inverse reinforcement learning
  • bayesian nonparametric
  • partially observable environments
  • preference elicitation
  • reward function
  • temporal difference
  • case based reasoning
  • control system