Instance-dependent Regret Bounds for Dueling Bandits.
Akshay Balsubramani
Zohar S. Karnin
Robert E. Schapire
Masrour Zoghi
Published in:
COLT (2016)
Keyphrases
regret bounds
multi-armed bandit
online learning
linear regression
lower bound
upper bound
feature selection
optimal solution
support vector
Bregman divergences
online convex optimization