Asymptotic Performance of Thompson Sampling in the Batched Multi-Armed Bandits.

Cem Kalkanli, Ayfer Özgür

Published in: CoRR (2021)

Keyphrases

multi-armed bandits
multi-armed bandit
bandit problems
feature selection
machine learning
dynamic programming
worst case
closed form