Hybrid MDP based integrated hierarchical Q-learning.
Chunlin Chen, Daoyi Dong, Han-Xiong Li, Tzyh Jong Tarn
Published in: Sci. China Inf. Sci. (2011)
Keyphrases
- reinforcement learning
- optimal policy
- state space
- markov decision processes
- hierarchical reinforcement learning
- reward function
- markov decision process
- policy iteration
- learning algorithm
- discounted reward
- dynamic programming
- function approximation
- cooperative
- multi agent
- reinforcement learning algorithms
- hierarchical clustering
- stochastic shortest path
- average cost
- action selection
- learning rate
- hierarchical structure
- machine learning
- partially observable
- average reward
- state action
- temporal difference learning
- planning under uncertainty
- markov decision problems
- search algorithm