When does return-conditioned supervised learning work for offline reinforcement learning?
David Brandfonbrener, Alberto Bietti, Jacob Buckman, Romain Laroche, Joan Bruna
Published in: NeurIPS (2022)
Keyphrases
- reinforcement learning
- supervised learning
- function approximation
- unsupervised learning
- reinforcement learning algorithms
- learning algorithm
- learning problems
- state space
- real-time
- semi-supervised
- kernel-based learning
- multi-agent
- semi-supervised learning
- action selection
- optimal policy
- transfer learning
- active learning
- machine learning
- optimal control
- policy search
- data sets
- training samples
- unlabeled data
- dynamic programming
- statistical learning
- learning classifier systems
- model-free
- learning process
- temporal difference
- real robot
- robot control
- control policy