Open Problem: First-Order Regret Bounds for Contextual Bandits.
Alekh Agarwal
Akshay Krishnamurthy
John Langford
Haipeng Luo
Robert E. Schapire
Published in:
COLT (2017)
Keyphrases
regret bounds
multi-armed bandit
lower bound
online learning
linear regression
upper bound
higher order
learning theory
special case
data points
least squares
Bregman divergences
online convex optimization