Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation.
Dotan Di Castro, Dmitry Volkinshtein, Ron Meir
Published in: NIPS (2008)
Keyphrases
- actor critic
- temporal difference
- reinforcement learning
- td learning
- function approximation
- policy gradient
- model free
- action selection
- temporal difference learning
- optimal control
- learning algorithm
- approximate dynamic programming
- neuro fuzzy
- evaluation function
- monte carlo
- gradient method
- supervised learning
- step size
- reinforcement learning algorithms
- policy iteration
- learning tasks
- learning process
- learning problems
- active learning
- training data
- neural network