Inverse Reinforcement Learning from a Gradient-based Learner.

Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

Published in: CoRR (2020)

Keyphrases

inverse reinforcement learning
Bayesian nonparametric
partially observable environments
preference elicitation
reward function
learning process
utility function
data mining
search algorithm
probabilistic model
Markov decision processes
temporal difference
target concept
simple examples