Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View.

Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach

Published in: CoRR (2024)

Keyphrases

td learning
temporal difference
supervised learning
reinforcement learning
evaluation function
function approximation
unsupervised learning
learning tasks
reinforcement learning algorithms
policy evaluation
monte carlo
model free
step size
statistical learning
training data
learning algorithm
semi supervised learning
action selection
training set
class labels
active learning
labeled data
multi step
policy iteration
semi supervised