Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning.
Shan Zhong, Quan Liu, Qi-ming Fu
Published in: Comput. Intell. Neurosci. (2016)
Keyphrases
- actor critic
- learning algorithm
- reinforcement learning
- policy gradient
- optimal control
- np hard
- hierarchical model
- dynamic programming
- gradient method
- learning tasks
- objective function
- active learning
- approximate dynamic programming
- probabilistic model
- optimal solution
- temporal difference
- latent variables
- average reward
- model free
- convergence rate
- linear program
- human body
- linear programming
- cost function
- prior knowledge