Duplicated Replay Buffer for Asynchronous Deep Deterministic Policy Gradient.
Seyed Mohammad Seyed Motehayeri
Vahid Baghi
Ehsan Maani Miandoab
Ali Moeini
Published in:
CSICC (2021)
Keyphrases
policy gradient
actor critic
parametric optimization
function approximation
reinforcement learning
model free reinforcement learning
gradient method
optimal control
reinforcement learning algorithms
variance reduction
approximation methods
average reward
search algorithm