Linear convergence of a policy gradient method for finite horizon continuous time stochastic control problems.
Christoph Reisinger, Wolfgang Stockinger, Yufei Zhang
Published in: CoRR (2022)
Keyphrases
- control problems
- optimal control
- finite horizon
- infinite horizon
- gradient method
- convergence rate
- control policies
- average cost
- stochastic control
- optimal policy
- reinforcement learning
- periodic review
- Brownian motion
- dynamic programming
- inventory control
- Markov decision process
- continuous state spaces
- policy gradient
- single product
- Markov decision processes
- convergence speed
- step size
- state space
- control strategy
- optimization methods
- inventory policy
- learning rate
- ordering cost
- state dependent
- multistage
- control law
- negative matrix factorization
- Markov chain
- sufficient conditions
- Markov processes
- policy iteration
- machine learning
- evolutionary algorithm
- approximate dynamic programming
- finite state
- initial state
- stochastic process
- lost sales
- control policy
- function approximators
- inventory level