Open Problem: Regret Bounds for Thompson Sampling.
Lihong Li
Olivier Chapelle
Published in:
COLT (2012)
Keyphrases
multi-armed bandit
regret bounds
reinforcement learning
random sampling
sample size
lower bound
maximum likelihood
Monte Carlo
linear regression
similarity measure
active learning
upper bound
multi-class
Bregman divergences