Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences.
Aadirupa Saha, Pierre Gaillard
Published in: CoRR (2022)
Keyphrases
- online learning
- higher education
- regret bounds
- online course
- computer mediated
- e-learning
- user preferences
- online learning environments
- distance learning
- blended learning
- corporate training
- world model
- online algorithms
- machine learning
- active learning
- decision making
- preference relations
- statistical analysis
- online learning environment
- classroom learning