Phasic Policy Gradient.
Karl Cobbe
Jacob Hilton
Oleg Klimov
John Schulman
Published in:
CoRR (2020)
Keyphrases
policy gradient
actor critic
reinforcement learning
parametric optimization
gradient method
function approximation
optimal control
variance reduction
reinforcement learning algorithms
approximation methods
model free reinforcement learning
Markov chain
single agent
average reward
state action