Samples are not all useful: Denoising policy gradient updates using variance.
Yannis Flet-Berliac, Philippe Preux
Published in: CoRR (2019)
Keyphrases
- denoising
- policy gradient
- variance reduction
- image processing
- reinforcement learning
- actor critic
- model free reinforcement learning
- function approximation
- parametric optimization
- gradient method
- training samples
- optimal control
- sample size
- monte carlo
- reinforcement learning algorithms
- approximation methods
- training set
- dynamic programming
- feature space