Ranking policies in discrete Markov decision processes.
Peng Dai, Judy Goldsmith
Published in: Ann. Math. Artif. Intell. (2010)
Keyphrases
- markov decision processes
- optimal policy
- markov decision process
- decision processes
- reward function
- average cost
- reinforcement learning
- finite state
- state space
- policy iteration algorithm
- decentralized control
- policy iteration
- partially observable markov decision processes
- transition matrices
- stationary policies
- planning under uncertainty
- reachability analysis
- discounted reward
- decision problems
- control policies
- macro actions
- infinite horizon
- expected reward
- finite horizon
- continuous state spaces
- average reward
- decision theoretic planning
- total reward
- dynamic programming
- action space
- continuous state
- partially observable
- factored mdps
- markov decision problems
- reinforcement learning algorithms
- sufficient conditions
- action sets
- finite number
- long run
- semi markov decision processes
- state abstraction
- inventory level