Achieving Near-Optimal Individual Regret & Low Communications in Multi-Agent Bandits.
Xuchuang Wang, Lin Yang, Yu-Zhen Janice Chen, Xutong Liu, Mohammad Hajiesmaili, Don Towsley, John C. S. Lui
Published in: ICLR (2023)
Keyphrases
- multi agent
- multi armed bandits
- multi agent systems
- heterogeneous agents
- high levels
- online learning
- intelligent agents
- confidence bounds
- cognitive agents
- multi class
- reinforcement learning
- worst case
- machine learning
- communication systems
- state space
- lower bound
- expert advice
- team formation
- oriented programming
- multi armed bandit
- decision trees
- social networks