Exploration for Free: How Does Reward Heterogeneity Improve Regret in Cooperative Multi-agent Bandits?
Xuchuang Wang
Lin Yang
Yu-Zhen Janice Chen
Xutong Liu
Mohammad Hajiesmaili
Don Towsley
John C. S. Lui
Published in:
UAI (2023)
Keyphrases
bandit problems
case study
reinforcement learning
online learning
cooperative multi-agent
artificial intelligence
game theory
expected reward
multi-armed bandit
real time
lower bound