Softmax Policy Gradient Methods Can Take Exponential Time to Converge.
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
Published in:
CoRR (2021)
Keyphrases
policy gradient methods
natural actor-critic
robot arm
reinforcement learning
multi-agent
policy gradient
function approximation
activation function