Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning.
Shixiang Gu, Timothy P. Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Published in: CoRR (2017)
Keyphrases
- policy gradient
- gradient estimation
- actor critic
- variance reduction
- reinforcement learning
- policy search
- policy gradient methods
- monte carlo
- gradient method
- reinforcement learning algorithms
- function approximation
- model free reinforcement learning
- optimal control
- sample size
- state space
- average reward
- learning algorithm
- approximation methods
- naive bayes classifier
- function approximators
- single agent
- model free
- neuro fuzzy
- machine learning