Inverse Reinforcement Learning in the Continuous Setting with Formal Guarantees.

Gregory Dexter, Kevin Bello, Jean Honorio

Published in: CoRR (2021)

Keyphrases

inverse reinforcement learning
partially observable environments
Bayesian nonparametric
preference elicitation
temporal difference
reward function
artificial intelligence
function approximation