TOPS: Transition-based VOlatility-controlled Policy Search and its Global Convergence.
Liangliang Xu, Aiwen Jiang, Daoming Lyu, Bo Liu
Published in: CoRR (2022)
Keyphrases
- global convergence
- policy search
- global optimum
- convergence speed
- convergence rate
- optimization methods
- reinforcement learning
- continuous state
- reinforcement learning algorithms
- search space
- dynamic programming
- optimization method
- policy gradient
- partially observable Markov decision processes
- learning algorithm
- particle swarm