A Variational Formula for Risk-Sensitive Reward.
V. Anantharam, Vivek S. Borkar
Published in: SIAM J. Control. Optim. (2017)
Keyphrases
- risk sensitive
- optimal control
- Markov decision processes
- reinforcement learning
- model free
- utility function
- average reward
- reward function
- Markov decision chains
- expected utility
- optimality criterion
- control policies
- decision theoretic
- long run
- state space
- reinforcement learning algorithms
- probability distribution
- decision makers
- real time
- multistage
- dynamic programming
- efficient optimization
- Markov decision problems
- control policy
- average cost
- decision theory
- optimal policy
- function approximation