Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning.
Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati
Published in: CoRR (2023)
Keyphrases
- unlabeled data
- semi supervised learning
- labeled data
- semi supervised
- reinforcement learning
- learning algorithm
- active learning
- supervised learning
- co training
- semi supervised classification
- training data
- transfer learning
- training set
- labeled training data
- class labels
- text categorization
- labeled examples
- text classification
- domain adaptation
- supervised learning algorithms
- unsupervised learning
- data points
- pairwise
- labeled and unlabeled data
- machine learning
- data sets
- small number
- machine learning algorithms
- relevance feedback
- positive examples
- feature selection
- information retrieval
- data mining
- neural network