Stabilizing Policy Improvement for Large-Scale Infinite-Horizon Dynamic Programming.
Michael J. O'Sullivan, Michael A. Saunders
Published in: SIAM J. Matrix Anal. Appl. (2009)
Keyphrases
- infinite horizon
- dynamic programming
- optimal policy
- finite horizon
- optimal control
- Markov decision process
- stochastic demand
- Markov decision processes
- state space
- partially observable
- production planning
- long run
- Dec-POMDPs
- average cost
- single item
- Markov decision problems
- policy iteration
- holding cost
- reinforcement learning
- fixed cost
- multistage
- inventory policy
- lead time
- decision problems
- state dependent
- optimal production
- total reward
- finite state
- ordering cost
- linear programming
- periodic review
- sufficient conditions
- lost sales