Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes.
Jan Kretínský, Tobias Meggendorfer
Published in: LICS (2018)
Keyphrases
- markov decision processes
- state space
- dynamic programming
- optimal policy
- reinforcement learning
- finite state
- transition matrices
- game theory
- heuristic search
- partially observable
- reachability analysis
- factored mdps
- planning under uncertainty
- average reward
- policy iteration
- reinforcement learning algorithms
- reward function
- average cost
- infinite horizon
- decision processes
- decision theoretic planning
- markov chain
- markov decision process
- finite horizon
- initial state
- model based reinforcement learning
- stochastic games
- planning problems
- state and action spaces
- action sets
- search space
- belief state
- dynamical systems
- nash equilibrium
- action space