Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods.
Xin Guo, Anran Hu, Junzi Zhang
Published in: AAAI (2022)
Keyphrases
- theoretical guarantees
- global convergence
- reinforcement learning
- convergence analysis
- convergence rate
- policy gradient methods
- optimization methods
- learning algorithm
- optimization problems
- worst case
- evolutionary algorithm
- metaheuristic
- linear programming
- global optimum
- policy iteration
- dynamic programming
- machine learning