Bounded regret in stochastic multi-armed bandits.

Sébastien Bubeck, Vianney Perchet, Philippe Rigollet

Published in: COLT (2013)

Keyphrases

multi armed bandits
multi armed bandit
bandit problems
reinforcement learning
decision problems
regret bounds
decision making
lower bound
online learning