Constant or logarithmic regret in asynchronous multiplayer bandits.
Hugo Richard, Etienne Boursier, Vianney Perchet
Published in: CoRR (2023)
Keyphrases
- regret bounds
- lower bound
- online learning
- linear regression
- expert advice
- upper bound
- multi-armed bandit
- worst case
- multi-armed bandits
- multi-armed bandit problems
- computer games
- online game
- least squares
- Bregman divergences
- online convex optimization
- linear predictors
- binary classification
- game play
- role-playing game
- bandit problems
- content analysis