Proximal Policy Gradient: PPO with Policy Gradient.
Ju-Seung Byun, Byungmoon Kim, Huamin Wang
Published in: CoRR (2020)
Keyphrases
- policy gradient
- reinforcement learning
- actor critic
- function approximation
- parametric optimization
- gradient method
- model free reinforcement learning
- optimal control
- reinforcement learning algorithms
- approximation methods
- reinforcement learning methods
- variance reduction
- single agent
- neural network
- average reward
- model free
- long run
- learning algorithm