Thompson Sampling Regret Bounds for Contextual Bandits with Sub-Gaussian Rewards.

Published in: ISIT (2023)

Keyphrases