On the role of overparameterization in off-policy Temporal Difference learning with linear function approximation.
Valentin Thomas
Published in: NeurIPS (2022)
Keyphrases
- function approximation
- temporal difference learning
- temporal difference learning algorithms
- function approximators
- reinforcement learning
- temporal difference
- learning tasks
- radial basis function
- fixed point
- model free
- evaluation function
- reinforcement learning algorithms
- game playing
- basis functions
- supervised learning
- least squares
- support vector machine