A Turnpike Theorem For A Risk-Sensitive Markov Decision Process with Stopping.
Eric V. Denardo, Uriel G. Rothblum
Published in: SIAM J. Control. Optim. (2006)
Keyphrases
- risk sensitive
- markov decision process
- markov decision processes
- optimal policy
- optimal control
- state space
- average cost
- reinforcement learning
- infinite horizon
- finite horizon
- control policies
- finite state
- policy iteration
- dynamic programming
- reward function
- action space
- markov decision problems
- reinforcement learning algorithms
- decision making
- decision processes
- partially observable
- long run
- average reward
- linear program
- expected utility
- fixed point
- optimality criterion
- function approximation