Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits.

Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi-Hua Zhou

Published in: COLT (2022)

Keyphrases

regret bounds
multi-armed bandits
multi-armed bandit
online learning
lower bound
linear regression
stochastic systems
multi-armed bandit problems
upper bound
expert advice
bandit problems
case study
online convex optimization
linear systems
test bed
support vector machine
NP-hard
neural network