Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion.
Junghyun Lee
Se-Young Yun
Kwang-Sung Jun
Published in:
AISTATS (2024)
Keyphrases
regret bounds
reinforcement learning
multi-armed bandit
lower bound
expert advice
active learning
probability distribution