On the Linear Convergence of Policy Gradient Methods for Finite MDPs.
Jalaj Bhandari
Daniel Russo
Published in:
AISTATS (2021)
Keyphrases
markov decision processes
reinforcement learning
policy gradient methods
convergence speed
state space
natural actor critic
stochastic shortest path
dynamic environments
finite number
reward function
policy iteration
actor critic
neural network
dynamic programming
convergence rate
average cost