Distributional Advantage Actor-Critic.
Shangda Li
Selina Bing
Steven Yang
Published in:
CoRR (2018)
Keyphrases
actor critic
reinforcement learning
temporal difference
policy gradient
gradient method
optimal control
approximate dynamic programming
reinforcement learning algorithms
function approximation
average reward
neural network
neuro fuzzy