A Marginal Productivity Index Policy for the Finite-Horizon Multiarmed Bandit Problem.
José Niño-Mora
Published in: CDC/ECC (2005)
Keyphrases
- finite horizon
- multiarmed bandit
- optimal policy
- infinite horizon
- optimal stopping
- Markov decision processes
- Markov decision process
- single product
- inventory models
- inventory control
- control policies
- multistage
- single item
- decision problems
- dynamic programming
- stochastic demand
- average cost
- state space
- reinforcement learning
- long run
- periodic review
- non stationary
- sufficient conditions
- yield management
- state dependent
- lot size
- probability distribution
- ordering cost
- production planning
- optimal control