Policy-based branch-and-bound for infinite-horizon Multi-model Markov decision processes.
Vinayak S. Ahluwalia, Lauren N. Steimle, Brian T. Denton
Published in: Comput. Oper. Res. (2021)
Keyphrases
- infinite horizon
- Markov decision processes
- optimal policy
- Markov decision process
- finite horizon
- branch and bound
- policy iteration
- state space
- average cost
- state dependent
- discount factor
- mathematical model
- finite state
- reinforcement learning
- optimal control
- long run
- upper bound
- production planning
- reinforcement learning algorithms
- combinatorial optimization
- sufficient conditions
- probabilistic model
- NP-hard
- search space
- optimal solution
- average reward
- machine learning