Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems.
Michael Giegrich, Christoph Reisinger, Yufei Zhang
Published in: CoRR (2022)
Keyphrases
- optimal control
- control problems
- linear quadratic
- infinite horizon
- finite horizon
- policy gradient
- average cost
- dynamic programming
- reinforcement learning
- Brownian motion
- control strategy
- optimal policy
- partially observable
- Markov decision process
- policy iteration
- reinforcement learning methods
- Markov decision processes
- control law
- initial state
- real time
- learning algorithm
- data mining