Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods.
Qing Li
Wengang Zhou
Zhenbo Lu
Houqiang Li
Published in:
CoRR (2022)
Keyphrases
reinforcement learning
actor critic
learning algorithm
reinforcement learning methods
learning tasks
function approximation
action selection
reinforcement learning algorithms
state space
temporal difference
function approximators
gradient method
cost function
temporal difference learning