-Present-Value Optimal Policies in Markov Decision Chains.
Michael J. O'Sullivan, Arthur F. Veinott Jr.
Published in: Math. Oper. Res. (2017)
Keyphrases
- markov decision chains
- optimal policy
- average cost
- finite state
- markov decision processes
- risk sensitive
- long run
- finite horizon
- decision problems
- infinite horizon
- dynamic programming
- state space
- control policies
- reinforcement learning
- multistage
- sufficient conditions
- markov decision process
- average reward
- policy iteration
- partially observable markov decision processes
- initial state
- markov decision problems
- control policy
- expected cost
- average reward reinforcement learning
- search algorithm
- inventory level
- reward function
- optimal strategy
- finite number