Diffusion gradient temporal difference for cooperative reinforcement learning with linear function approximation.
Sergio Valcarcel Macua, Pavle Belanovic, Santiago Zazo
Published in: CIP (2012)
Keyphrases
- function approximation
- temporal difference
- reinforcement learning
- function approximators
- td learning
- model free
- temporal difference learning
- policy gradient
- reinforcement learning algorithms
- actor critic
- radial basis function
- temporal difference methods
- policy iteration
- learning tasks
- multi agent
- state space
- evaluation function
- policy evaluation
- monte carlo
- markov decision processes
- machine learning
- reinforcement learning problems
- dynamic programming
- learning agent
- e learning
- decision making
- data mining
- td methods
- neural network