Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits.
Tianyuan Jin
Pan Xu
Xiaokui Xiao
Anima Anandkumar
Published in:
NeurIPS (2022)
Keyphrases
multi-armed bandits
bandit problems
multi-armed bandit
learning algorithm
exponential family
lower bound
online learning
knn
random sampling
markov chain monte carlo
bregman divergences