A new strong optimality criterion for nonstationary Markov decision processes.
Xianping Guo, Peng Shi, Weiping Zhu
Published in: Math. Methods Oper. Res. (2000)
Keyphrases
- nonstationary
- Markov decision processes
- optimality criterion
- average reward
- risk sensitive
- optimal policy
- finite horizon
- policy iteration
- discounted reward
- finite state
- dynamic programming
- reinforcement learning
- state space
- total reward
- long run
- decision theoretic planning
- average cost
- random fields
- evaluation function
- state and action spaces
- Markov chain
- transition matrices
- stationary policies
- infinite horizon
- partially observable
- model free