Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View.
Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach
Published in: ICLR (2024)
Keyphrases
- td learning
- temporal difference
- supervised learning
- reinforcement learning
- evaluation function
- function approximation
- unsupervised learning
- reinforcement learning algorithms
- learning tasks
- step size
- model free
- semi supervised learning
- training data
- monte carlo
- machine learning
- semi supervised
- action selection
- statistical learning
- policy evaluation
- policy iteration
- training set
- active learning
- learning algorithm
- multi step
- decision trees
- support vector machine (svm)