DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training.
Philipp Altmann, Thomy Phan, Fabian Ritz, Thomas Gabor, Claudia Linnhoff-Popien
Published in: CoRR (2023)
Keyphrases
- co-training
- reinforcement learning
- semi-supervised
- learning process
- learning algorithm
- supervised learning
- active learning
- multi-view
- bandit problems
- labeled and unlabeled data
- learning tasks
- unsupervised learning
- text classification
- prior knowledge
- unlabeled data
- Markov decision processes
- feature selection
- artificial intelligence
- information retrieval
- data mining