Relax but stay in control: from value to algorithms for online Markov decision processes.
Peng Guan, Maxim Raginsky, Rebecca Willett
Published in: CoRR (2013)
Keyphrases
- Markov decision processes
- policy iteration
- reachability analysis
- factored MDPs
- dynamic programming
- state space
- optimal policy
- finite state
- reinforcement learning
- least squares
- finite horizon
- decision theoretic planning
- stochastic shortest path
- learning algorithm
- transition matrices
- partially observable Markov decision processes
- infinite horizon
- computational complexity
- planning under uncertainty
- policy evaluation
- average reward
- long run