Preference-based Online Learning with Dueling Bandits: A Survey.
Róbert Busa-Fekete, Eyke Hüllermeier, Adil El Mesaoudi-Paul
Published in: CoRR (2018)
Keyphrases
- online learning
- regret bounds
- online course
- e-learning
- stochastic systems
- higher education
- distance education
- blended learning
- computer-mediated
- learning management systems
- multi-armed bandits
- distance learning
- active learning
- online algorithms
- user preferences
- online education
- preference relations
- online learning environments
- upper bound