On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations.

Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, Yee Whye Teh

Published in: CoRR (2022)

Keyphrases

reinforcement learning
function approximation
expert advice
learning algorithm
computer assisted
state space
machine learning
reinforcement learning algorithms
markov decision processes
multi agent
human experts
expert knowledge
risk minimization
temporal difference
domain experts
multi agent systems
model free
least squares
optimal policy
domain knowledge
total least squares
objective function
kullback leibler
neural network
multi agent reinforcement learning
transition model
robotic control