Mixed Policy Gradient.
Yang Guan, Jingliang Duan, Shengbo Eben Li, Jie Li, Jianyu Chen, Bo Cheng
Published in: CoRR (2021)
Keyphrases
- policy gradient
- actor critic
- parametric optimization
- gradient method
- function approximation
- reinforcement learning
- optimal control
- approximation methods
- model free reinforcement learning
- variance reduction
- single agent
- dynamic programming
- multi agent
- average reward
- reinforcement learning algorithms
- optimal policy
- learning algorithm