Optimal Policies for a Pandemic: A Stochastic Game Approach and a Deep Learning Algorithm.
Yao Xuan, Robert Balkin, Jiequn Han, Ruimeng Hu, Héctor D. Ceniceros
Published in: MSML (2021)
Keyphrases
- optimal policy
- learning algorithm
- reinforcement learning
- state dependent
- control policies
- markov decision processes
- decision problems
- dynamic programming
- finite horizon
- state space
- stochastic inventory control
- reinforcement learning algorithms
- multistage
- finite state
- average reward
- long run
- base stock policies
- serial inventory systems
- sufficient conditions
- average reward reinforcement learning
- markov decision process
- initial state
- infinite horizon
- machine learning
- sample path
- optimal strategy
- learning rate
- long run average cost
- dynamic programming algorithms
- policy iteration
- periodic review
- semi markov decision processes
- average cost
- monte carlo
- demand distributions
- lost sales
- inventory models
- stochastic process
- stochastic model
- model free
- policy evaluation
- markov decision problems
- optimal control
- game playing
- partially observable markov decision processes
- linear programming
- reward function