Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models.
William L. Cooper, Bharath Rangarajan
Published in: Oper. Res. (2012)
Keyphrases
- inventory models
- markov decision processes
- finite horizon
- average cost
- optimal policy
- markov decision process
- inventory control
- finite state
- state space
- single period
- initial state
- reinforcement learning
- infinite horizon
- dynamic programming
- policy iteration
- lost sales
- control policies
- partially observable
- average reward
- action space
- machine learning
- decision problems
- multi agent
- reward function
- state dependent
- long run
- single item
- cost function
- fixed cost
- service level