Stochastic Variance-Reduced Policy Gradient.
Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli
Published in: CoRR (2018)
Keyphrases
- policy gradient
- model free reinforcement learning
- variance reduction
- monte carlo
- actor critic
- parametric optimization
- reinforcement learning
- function approximation
- gradient method
- optimal control
- single agent
- importance sampling
- approximation methods
- average reward
- state transition
- reinforcement learning methods
- multi agent
- machine learning