Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions.
Sébastien Bubeck
Thomas Budzinski
Mark Sellke
Published in:
CoRR (2020)
Keyphrases
multi armed bandit
regret bounds
cooperative
multi armed bandits
multi player
reinforcement learning
online game
lower bound
online learning
information sharing
linear regression
game playing
multi agent systems
least squares
worst case
educational games