Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations
Kousha EtessamiAlistair StewartMihalis YannakakisPublished in: CoRR (2012)
Keyphrases
- markov decision processes
- min max
- policy iteration
- factored mdps
- computational complexity
- finite state
- reinforcement learning
- planning under uncertainty
- optimal policy
- learning algorithm
- dynamic programming
- decision theoretic planning
- transition matrices
- stochastic shortest path
- state space
- generative model
- partially observable markov decision processes
- policy evaluation
- reachability analysis