Deviations of Stochastic Bandit Regret.

Antoine Salomon, Jean-Yves Audibert

Published in: ALT (2011)

Keyphrases

regret bounds
multi-armed bandit
bandit problems
online learning
lower bound
upper confidence bound
linear regression
expert advice
multi-armed bandit problems
upper bound
Monte Carlo
binary classification
stochastic optimization
reinforcement learning
sufficient conditions
support vector
Bayesian networks
confidence bounds
contextual bandit
case study
e-learning