Multi-Task Off-Policy Learning from Bandit Feedback.
Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
Published in: CoRR (2022)
Keyphrases
- multi task
- multiple tasks
- multi task learning
- learning problems
- learning tasks
- multitask learning
- supervised learning
- learning environment
- learning process
- data mining
- unsupervised learning
- random sampling
- information gain
- transfer learning
- machine learning algorithms
- image features
- knn
- active learning
- training data
- learning algorithm
- machine learning