Deterministic policy gradient: Convergence analysis.
Huaqing Xiong, Tengyu Xu, Lin Zhao, Yingbin Liang, Wei Zhang
Published in: UAI (2022)
Keyphrases
- neural network
- convergence analysis
- policy gradient
- approximation methods
- global convergence
- reinforcement learning
- gradient method
- optimality conditions
- function approximation
- optimal control
- convergence rate
- reinforcement learning algorithms
- variance reduction
- genetic algorithm
- multi-objective
- partially observable Markov decision processes
- belief state
- nonlinear programming
- average reward
- optimization methods
- decision problems
- multi-agent systems
- optimal solution