Coordination without communication: optimal regret in two players multi-armed bandits.
Sébastien Bubeck
Thomas Budzinski
Published in:
CoRR (2020)
Keyphrases
multi armed bandits
multi armed bandit
bandit problems
worst case
information sharing
game theory
online learning
cooperative
multiagent systems
optimal solution
multi agent
support vector machine
dynamic programming
decision making
loss function
multi agent systems
optimal strategy
reinforcement learning