On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations.
Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, Yee Whye Teh
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- function approximation
- expert advice
- learning algorithm
- computer assisted
- state space
- machine learning
- reinforcement learning algorithms
- Markov decision processes
- multi agent
- human experts
- expert knowledge
- risk minimization
- temporal difference
- domain experts
- multi agent systems
- model free
- least squares
- optimal policy
- domain knowledge
- total least squares
- objective function
- Kullback-Leibler
- neural network
- multi agent reinforcement learning
- transition model
- robotic control