On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes.
Huizhen Yu
Published in: SIAM J. Control Optim. (2015)
Keyphrases
- Markov decision processes
- total cost
- stochastic shortest path
- average cost
- policy iteration
- finite state
- optimal policy
- reinforcement learning
- state space
- dynamic programming
- infinite horizon
- optimal solution
- transition matrices
- reachability analysis
- finite horizon
- factored MDPs
- minimum total cost
- service level
- planning under uncertainty
- holding cost
- decision theoretic planning
- action space
- Markov decision process
- partially observable
- lead time
- data mining
- function approximation
- risk sensitive
- average reward
- convergence rate
- Bayesian networks
- stationary policies
- inventory level
- state dependent
- probability distribution
- Markov chain
- learning rate
- heuristic search
- planning problems
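
For context, the sketch below shows the standard value iteration recursion named in the title, applied to a generic finite-state, finite-action total-cost MDP. It is not the paper's specific algorithm or assumptions; the function name, array layout (`P`, `c`), and the toy numbers in the usage example are illustrative choices only.

```python
import numpy as np

def value_iteration(P, c, gamma=1.0, tol=1e-8, max_iter=10_000):
    """Standard value iteration for a finite-state, finite-action MDP.

    P     : array of shape (A, S, S); P[a, s, s'] is the transition probability.
    c     : array of shape (A, S); c[a, s] is the expected one-stage cost.
    gamma : discount factor (1.0 corresponds to the undiscounted total-cost case).
    Returns the final value estimate J and a greedy (stationary) policy.
    """
    A, S, _ = P.shape
    J = np.zeros(S)  # common zero initialization
    Q = c.copy()
    for _ in range(max_iter):
        # Bellman backup: Q[a, s] = c[a, s] + gamma * sum_{s'} P[a, s, s'] * J[s']
        Q = c + gamma * (P @ J)
        J_new = Q.min(axis=0)
        if np.max(np.abs(J_new - J)) < tol:
            J = J_new
            break
        J = J_new
    policy = Q.argmin(axis=0)  # action minimizing the backed-up cost in each state
    return J, policy

# Tiny hypothetical stochastic-shortest-path-style example:
# state 1 is absorbing with zero cost, state 0 chooses between two actions.
P = np.array([[[0.9, 0.1], [0.0, 1.0]],
              [[0.5, 0.5], [0.0, 1.0]]])
c = np.array([[2.0, 0.0],
              [1.0, 0.0]])
J, policy = value_iteration(P, c, gamma=1.0)
print(J, policy)  # J[0] converges to 2.0; the cheaper second action is selected
```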