Discounted Markov decision processes with utility constraints.
Yoshinobu Kadota, Masami Kurano, Masami Yasuda
Published in: Comput. Math. Appl. (2006)
Keyphrases
- markov decision processes
- optimal policy
- reinforcement learning
- finite state
- state space
- policy iteration
- dynamic programming
- transition matrices
- infinite horizon
- reinforcement learning algorithms
- average reward
- finite horizon
- planning under uncertainty
- average cost
- decision processes
- partially observable
- markov decision process
- objective function
- discounted reward
- risk sensitive
- action space
- decision problems
- markov chain
- sufficient conditions
- multistage
- decision theoretic
- heuristic search
- function approximation
- decision theoretic planning
- reachability analysis
- semi markov decision processes
- utility function