Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View.
Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach
Published in: CoRR (2024)
Keyphrases
- TD learning
- temporal difference
- supervised learning
- reinforcement learning
- evaluation function
- function approximation
- unsupervised learning
- learning tasks
- reinforcement learning algorithms
- policy evaluation
- Monte Carlo
- model-free
- step size
- statistical learning
- training data
- learning algorithm
- semi-supervised learning
- action selection
- training set
- class labels
- active learning
- labeled data
- multi-step
- policy iteration
- semi-supervised