Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits.
Haipeng Luo
Mengxiao Zhang
Peng Zhao
Zhi-Hua Zhou
Published in:
CoRR (2022)
Keyphrases
regret bounds
multi-armed bandits
multi-armed bandit
lower bound
multi-armed bandit problems
online learning
linear regression
stochastic systems
expert advice
upper bound
bandit problems
case study
loss function
Bregman divergences
linear predictors
test bed
worst case
reinforcement learning