Constant or logarithmic regret in asynchronous multiplayer bandits.

Hugo Richard, Etienne Boursier, Vianney Perchet

Published in: CoRR (2023)

Keyphrases

regret bounds
lower bound
online learning
linear regression
expert advice
upper bound
multi-armed bandit
worst case
multi-armed bandits
multi-armed bandit problems
computer games
online game
least squares
Bregman divergences
online convex optimization
linear predictors
binary classification
game play
role-playing game
bandit problems
content analysis