Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning.
Alessandro Montenegro, Marco Mussi, Matteo Papini, Alberto Maria Metelli
Published in: CoRR (2024)
Keyphrases
- global convergence
- reinforcement learning
- optimal policy
- policy search
- global optimum
- convergence speed
- optimization methods
- convergence rate
- convergence analysis
- action selection
- Markov decision process
- Markov decision processes
- constrained optimization problems
- control policy
- convex minimization
- function approximators
- line search
- reinforcement learning problems
- reinforcement learning algorithms
- action space
- policy iteration
- state space
- Markov decision problems
- coordinate ascent
- actor critic
- partially observable
- policy gradient
- function approximation
- reward function
- search space
- globally convergent
- neural network
- evolutionary algorithm
- partially observable Markov decision processes
- learning algorithm
- hybrid algorithm
- particle swarm
- machine learning
- cost function
- dynamic programming
- average reward
- model free