Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs.
Lior Shani, Yonathan Efroni, Shie Mannor
Published in: AAAI (2020)
Keyphrases
- global convergence
- trust region
- global optimum
- line search
- optimization methods
- optimization method
- Newton method
- unconstrained optimization
- convergence analysis
- objective function
- faster convergence
- convergence rate
- convergence speed
- risk minimization
- simulated annealing
- Markov decision processes
- optimization problems
- search space
- particle swarm
- step size
- state space
- optimal solution
- least squares
- reinforcement learning
- quadratic programming
- differential evolution
- optimization algorithm
- dynamic programming
- conjugate gradient
- multi objective
- particle swarm optimization
- linear programming