The Virtues of Pessimism in Inverse Reinforcement Learning.
David Wu
Gokul Swamy
J. Andrew Bagnell
Zhiwei Steven Wu
Sanjiban Choudhury
Published in:
CoRR (2024)
Keyphrases
inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
reward function
temporal difference
case based reasoning
control system