Identifiability in inverse reinforcement learning.
Haoyang Cao
Samuel N. Cohen
Lukasz Szpruch
Published in:
NeurIPS (2021)
Keyphrases
inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
reward function
temporal difference
search algorithm
optimal policy
artificial intelligence