Asymptotic Performance of Thompson Sampling for Batched Multi-Armed Bandits.
Cem Kalkanli
Ayfer Özgür
Published in:
IEEE Trans. Inf. Theory (2023)
Keyphrases
multi-armed bandits
multi-armed bandit
bandit problems
random sampling
machine learning
Monte Carlo
lower bound
dynamic programming
upper bound
Markov chain
sample size