A note on strong 1-optimal policies in Markov decision chains with unbounded costs.
Andrzej S. Nowak
Published in: Math. Methods Oper. Res. (1999)
Keyphrases
- average cost
- Markov decision chains
- optimal policy
- Markov decision processes
- long run
- finite state
- finite horizon
- infinite horizon
- finite number
- total cost
- state space
- decision problems
- optimal control
- dynamic programming
- reinforcement learning
- multistage
- initial state
- risk sensitive
- holding cost
- Markov decision problems
- setup cost
- linear program
- average reward
- linear programming
- dynamic programming algorithms
- Markov decision process
- expected cost
- policy iteration
- control policy
- single item
- machine learning
- sufficient conditions