Conditional Value-at-Risk for Random Immediate Reward Variables in Markov Decision Processes.
Masayuki Kageyama, Takayuki Fujii, Koji Kanefuji, Hiroe Tsubaki
Published in: Am. J. Comput. Math. (2011)
Keyphrases
- markov decision processes
- average reward
- reinforcement learning
- reward function
- total reward
- expected reward
- discounted reward
- transition matrices
- finite state
- policy iteration
- state space
- optimal policy
- stationary policies
- planning under uncertainty
- dynamic programming
- decision theoretic planning
- reinforcement learning algorithms
- finite horizon
- decision processes
- partially observable
- markov decision process
- action space
- reachability analysis
- state variables
- factored mdps
- state and action spaces
- multi agent
- function approximation
- stochastic shortest path
- semi markov decision processes
- least squares
- random variables
- state abstraction
- risk sensitive
- long run
- infinite horizon
- decision diagrams
- model based reinforcement learning
- partially observable markov decision processes
- machine learning