Model-based controlled learning of MDP policies with an application to lost-sales inventory control.
Willem van Jaarsveld
Published in: CoRR (2020)
Keyphrases
- inventory control
- lost sales
- optimal policy
- inventory models
- finite horizon
- periodic review
- stochastic demand
- learning algorithm
- lead time
- reinforcement learning
- inventory systems
- Markov decision process
- supply chain
- base stock policies
- customer service
- linear programming
- inventory level
- multi-item
- Markov decision processes
- state space
- demand distributions