Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian rewards.

Published in: CoRR (2023)

Keyphrases