Near-Optimal Regret Bounds for Thompson Sampling.

Shipra Agrawal, Navin Goyal

Published in: J. ACM (2017)

Keyphrases

multi-armed bandit
regret bounds
random sampling
lower bound
sample size
pairwise
sampling algorithm
active learning
probabilistic model
upper bound
online learning