Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods.
Xin Guo, Anran Hu, Junzi Zhang
Published in: CoRR (2021)
Keyphrases
- theoretical guarantees
- global convergence
- reinforcement learning
- policy gradient methods
- learning algorithm
- convergence rate
- optimization methods
- convergence analysis
- natural actor critic
- machine learning
- policy iteration
- worst case
- model free
- optimal control
- convergence speed
- Markov decision processes
- linear programming
- robot arm
- reinforcement learning methods
- computational complexity
- neural network