Offline Reinforcement Learning with Fisher Divergence Critic Regularization.
Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum
Published in: ICML (2021)
Keyphrases
- reinforcement learning
- actor critic
- temporal difference
- function approximation
- reinforcement learning algorithms
- policy gradient
- information geometry
- state space
- markov decision processes
- optimal control
- approximate dynamic programming
- model free
- real time
- regularization parameter
- temporal difference learning
- learning algorithm
- evaluation function
- machine learning
- reinforcement learning methods
- optimal policy
- neuro fuzzy
- action selection
- supervised learning
- policy search
- loss function
- parameter selection
- gradient method
- denoising
- monte carlo
- fisher information
- dynamic programming
- regularization term