Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning.
Noah Golowich, Ankur Moitra, Dhruv Rohatgi
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- supervised learning
- exploration strategy
- function approximation
- exploration exploitation
- action selection
- prediction accuracy
- temporal difference
- unsupervised learning
- active learning
- state space
- learning problems
- kernel based learning
- active exploration
- learning algorithm
- prediction error
- NP-hard
- semi-supervised learning
- machine learning
- statistical learning
- autonomous learning
- Markov decision processes
- supervised classification
- temporal difference learning
- reinforcement learning algorithms
- genetic algorithm
- prediction model
- learning tasks
- reinforcement learning methods
- optimal policy
- NP-complete
- evolutionary algorithm
- training set
- multi-agent
- balancing exploration and exploitation
- prediction algorithm
- learning capabilities
- multiple instance learning
- transfer learning
- class labels
- training examples
- training samples
- unlabeled data
- labeled data
- learning process