Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control.
Santanu Rathod, Manoj Bhadu, Abir De
Published in: CoRR (2021)
Keyphrases
- model free
- global convergence
- linear quadratic
- optimal control
- reinforcement learning
- global optimum
- convergence rate
- optimization methods
- function approximation
- closed loop
- reinforcement learning algorithms
- convergence speed
- dynamical systems
- temporal difference
- policy iteration
- neural network
- control method
- dynamic programming
- control strategy
- step size
- optimization method
- markov chain
- genetic algorithm