Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback.
Chicheng Zhang
Alekh Agarwal
Hal Daumé III
John Langford
Sahand Negahban
Published in:
ICML (2019)
Keyphrases
multi-armed bandit
semi-supervised
starting point
contextual information
regret bounds
machine learning
user feedback
context sensitive
unsupervised learning
supervised learning
learning algorithm
Markov chain
relevance feedback
pairwise
case study
information systems
context dependent
data mining