Offline Reinforcement Learning with Fisher Divergence Critic Regularization.
Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- actor critic
- function approximation
- reinforcement learning algorithms
- temporal difference
- policy gradient
- information geometry
- model free
- Markov decision processes
- state space
- optimal control
- learning algorithm
- prior information
- optimal policy
- action selection
- real time
- control policy
- function approximators
- natural actor critic
- temporal difference learning
- Bregman divergences
- partially observable
- learning process
- information theory
- Neyman-Pearson
- reinforcement learning problems
- reward function
- policy search
- relative entropy
- action space
- feature space
- machine learning