Preference-based Online Learning with Dueling Bandits: A Survey.

Róbert Busa-Fekete, Eyke Hüllermeier, Adil El Mesaoudi-Paul

Published in: CoRR (2018)

Keyphrases

online learning
regret bounds
online course
e-learning
stochastic systems
higher education
distance education
blended learning
computer-mediated
learning management systems
multi-armed bandits
distance learning
active learning
online algorithms
user preferences
online education
preference relations
online learning environments
upper bound