GRAC: Self-Guided and Self-Regularized Actor-Critic.
Lin Shao, Yifan You, Mengyuan Yan, Qingyun Sun, Jeannette Bohg
Published in: CoRR (2020)
Keyphrases
- actor critic
- reinforcement learning
- optimal control
- policy gradient
- temporal difference
- approximate dynamic programming
- neuro fuzzy
- gradient method
- policy iteration
- reinforcement learning algorithms
- least squares
- function approximation
- dynamic programming
- evaluation function
- markov decision processes
- dynamical systems
- average reward
- neural network
- linear program
- partially observable markov decision processes
- dynamic environments
- temporal difference learning
- learning algorithm
- machine learning