Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient.
Haonan Jia, Xiao Zhang, Jun Xu, Wei Zeng, Hao Jiang, Xiaohui Yan, Ji-Rong Wen
Published in: CoRR (2020)
Keyphrases
- variance reduction
- gradient estimation
- monte carlo
- stochastic approximation
- policy gradient
- reinforcement learning
- importance sampling
- sample size
- bias variance decomposition
- function approximation
- quasi monte carlo
- state space
- markov chain
- confidence intervals
- naive bayes classifier
- reinforcement learning algorithms
- gradient method
- learning algorithm
- optimal policy
- trade-off
- bayesian networks
- policy iteration
- naive bayes