High-Probability Regret Bounds for Bandit Online Linear Optimization.
Peter L. Bartlett
Varsha Dani
Thomas P. Hayes
Sham M. Kakade
Alexander Rakhlin
Ambuj Tewari
Published in:
COLT (2008)
Keyphrases
regret bounds
online learning
multi-armed bandit
online convex optimization
lower bound
linear regression
e-learning
upper bound
quadratic programming
Bregman divergences
reinforcement learning
optimal solution
active learning
special case
least squares
closed form