Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization.
Yuhao Ding
Junzi Zhang
Javad Lavaei
Published in:
CoRR (2021)
Keyphrases
policy gradient methods
natural actor critic
Monte Carlo
convergence rate
convergence speed
robot arm
neural network
machine learning
dynamic programming
least squares