Sequential Bayes-Optimal Policies for Multiple Comparisons with a Known Standard.
Jing Xie, Peter I. Frazier
Published in: Oper. Res. (2013)
Keyphrases
- optimal policy
- Markov decision processes
- decision problems
- state space
- finite horizon
- multistage
- dynamic programming
- reinforcement learning
- finite state
- long run
- average reward reinforcement learning
- average reward
- serial inventory systems
- Markov decision process
- infinite horizon
- Monte Carlo
- computational complexity
- multi agent
- dynamic programming algorithms
- Bayesian networks
- machine learning