Inverse Reinforcement Learning from a Gradient-based Learner.

Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

Published in: NeurIPS (2020)

Keyphrases

inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
reward function
learning process
temporal difference
artificial intelligence
Markov decision processes