Forward Actor-Critic for Nonlinear Function Approximation in Reinforcement Learning.
Vivek Veeriah, Harm van Seijen, Richard S. Sutton
Published in: AAMAS (2017)
Keyphrases
- function approximation
- actor critic
- reinforcement learning
- temporal difference
- policy gradient
- reinforcement learning algorithms
- temporal difference learning
- model free
- approximate dynamic programming
- function approximators
- state space
- learning tasks
- policy iteration
- optimal control
- Markov decision processes
- control problems
- machine learning
- radial basis function
- learning problems
- sufficient conditions
- action selection
- learning algorithm
- neuro fuzzy
- optimal policy
- gradient method
- artificial neural networks
- multi agent
- average reward
- supervised learning