Online convex optimization in the bandit setting: gradient descent without a gradient.

Abraham Flaxman, Adam Tauman Kalai, H. Brendan McMahan

Published in: SODA (2005)

Keyphrases

regret bounds
online convex optimization
online learning
lower bound
cost function
multi-armed bandit
upper bound
loss function
objective function
linear regression
long run
feature selection
image processing
higher level
Bregman divergences