Variance Reduction for Deep Q-Learning Using Stochastic Recursive Gradient.
Haonan Jia, Xiao Zhang, Jun Xu, Wei Zeng, Hao Jiang, Xiaohui Yan
Published in: ICONIP (4) (2022)
Keyphrases
- variance reduction
- gradient estimation
- monte carlo
- stochastic approximation
- policy gradient
- sample size
- importance sampling
- reinforcement learning
- function approximation
- quasi monte carlo
- bias variance decomposition
- learning algorithm
- markov chain
- confidence intervals
- model free
- upper bound
- state space
- optimal policy
- machine learning
- particle filter
- gradient method
- least squares