Softmax Policy Gradient Methods Can Take Exponential Time to Converge.
Gen Li
Yuting Wei
Yuejie Chi
Yuantao Gu
Yuxin Chen
Published in:
COLT (2021)
Keyphrases
policy gradient methods
natural actor critic
robot arm
policy gradient
machine learning
fixed point
optimal control