Linear Convergence of a Policy Gradient Method for Some Finite Horizon Continuous Time Control Problems.
Christoph Reisinger, Wolfgang Stockinger, Yufei Zhang
Published in: SIAM J. Control. Optim. (2023)
Keyphrases
- control problems
- optimal control
- infinite horizon
- finite horizon
- gradient method
- convergence rate
- average cost
- actor critic
- optimal policy
- dynamic programming
- reinforcement learning
- step size
- single product
- Markov decision process
- ordering cost
- partially observable
- convergence speed
- stochastic control
- control strategy
- control policies
- learning rate
- multistage
- control law
- optimization methods
- Markov decision processes
- Brownian motion
- Markov decision problems
- policy iteration
- inventory policy
- partially observable Markov decision processes
- state space
- mathematical model