Online planning for large MDPs with MAXQ decomposition.
Aijun Bai, Feng Wu, Xiaoping Chen
Published in: AAMAS (2012)
Keyphrases
- reinforcement learning
- stochastic domains
- planning problems
- Markov decision processes
- state space
- online learning
- probabilistic planning
- planning under uncertainty
- decomposition method
- decision theoretic planning
- heuristic search
- real time
- partially observable
- initial state
- Markov decision problems
- machine learning
- least squares
- dynamic programming
- decomposition algorithm