Applying Online Expert Supervision in Deep Actor-Critic Reinforcement Learning.
Jin Zhang, Jiansheng Chen, Yiqing Huang, Weitao Wang, Tianpeng Li
Published in: PRCV (2) (2018)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- policy gradient
- optimal control
- approximate dynamic programming
- reinforcement learning algorithms
- function approximation
- gradient method
- neuro fuzzy
- policy iteration
- state space
- optimal policy
- model free
- machine learning
- multi agent
- learning algorithm
- natural actor critic
- policy gradient methods
- control problems
- control policy
- average reward
- dynamic programming
- rl algorithms
- evaluation function
- markov decision processes
- reward function
- transfer learning
- fuzzy logic
- temporal difference learning
- neural network