Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs.

Haipeng Luo, Hanghang Tong, Mengxiao Zhang, Yuheng Zhang

Published in: ALT (2023)

Keyphrases

multi-armed bandit
wide range
online learning
regret bounds
neural network
lower bound
probability distribution
relevance feedback
graph theory
multi-agent
graph representation
graph data
multi-armed bandit problems