Self-Guided Actor-Critic: Reinforcement Learning from Adaptive Expert Demonstrations.
Haoran Zhang, Chenkun Yin, Yanxin Zhang, Shangtai Jin
Published in: CDC (2020)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- optimal control
- approximate dynamic programming
- policy gradient
- reinforcement learning algorithms
- neuro fuzzy
- gradient method
- function approximation
- policy iteration
- natural actor critic
- learning algorithm
- markov decision processes
- dynamic programming
- model free
- temporal difference learning
- machine learning
- adaptive control
- optimal policy
- least squares
- control problems
- policy gradient methods
- objective function
- fuzzy logic
- average reward
- supervised learning
- markov decision process
- linear program
- markov chain
- monte carlo
- transfer learning
- optimization methods
- evaluation function
- learning tasks