Dynamic Policy Programming with Function Approximation.
Mohammad Gheshlaghi Azar, Vicenç Gómez, Bert Kappen
Published in: AISTATS (2011)
Keyphrases
- function approximation
- reinforcement learning
- function approximators
- temporal difference learning algorithms
- temporal difference learning
- reinforcement learning problems
- policy evaluation
- temporal difference
- policy gradient
- learning tasks
- radial basis function
- dynamic environments
- optimal policy
- model free
- action selection
- td learning
- exploration exploitation tradeoff
- data mining
- reinforcement learning algorithms
- artificial neural networks
- learning algorithm
- neural network