Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization.
Ke Sun, Yafei Wang, Yi Liu, Yingnan Zhao, Bo Pan, Shangling Jui, Bei Jiang, Linglong Kong
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- stochastic approximation
- function approximation
- Markov decision processes
- optimal control
- learning capabilities
- convergence rate
- state space
- optimal policy
- supervised learning
- convergence speed
- policy iteration
- dynamic programming
- fitted Q iteration
- temporal difference learning
- deep learning
- blind source separation
- machine learning
- learning problems
- reinforcement learning algorithms
- multi-agent
- multi-agent reinforcement learning
- policy search
- video stabilization