Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds.

Johan Östman, Ather Gattami, Daniel Gillblad

Published in: CoRR (2023)

Keyphrases

directed graph
regret bounds
online learning
multi-armed bandit
decentralized decision making
random walk
online convex optimization
linear regression
directed acyclic graph
lower bound
graph structure
e-learning
Markov chain
strongly connected
undirected graph
upper bound
optimal solution