Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems.
Aditya Mate, Arpita Biswas, Christoph Siebenbrunner, Milind Tambe
Published in: CoRR (2021)
Keyphrases
- finite horizon
- multi-armed bandit problems
- infinite horizon
- optimal control
- optimal policy
- optimal stopping
- Markov decision processes
- average cost
- bandit problems
- inventory models
- single product
- inventory control
- multistage
- yield management
- Markov decision process
- production planning
- dynamic programming
- reinforcement learning
- single item
- real time
- long run
- finite state
- learning algorithm
- lot size
- non-stationary
- multi-agent
- ordering cost