Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs.
Haipeng Luo
Hanghang Tong
Mengxiao Zhang
Yuheng Zhang
Published in:
ALT (2023)
Keyphrases
multi armed bandit
wide range
online learning
regret bounds
neural network
lower bound
probability distribution
relevance feedback
graph theory
multi agent
graph representation
graph data
multi armed bandit problems