Identifiability in inverse reinforcement learning.
Haoyang Cao
Samuel N. Cohen
Lukasz Szpruch
Published in: CoRR (2021)
Keyphrases
inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
reward function
artificial intelligence
temporal difference
search algorithm