UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem.
Peter Auer
Ronald Ortner
Published in:
Periodica Mathematica Hungarica (2010)
Keyphrases
multi-armed bandit
regret bounds
reinforcement learning
linear regression
lower bound
online learning
upper bound
image sequences
training data
optimal solution
text classification