Improved algorithms for bandit with graph feedback via regret decomposition.
Yuchen He, Chihao Zhang
Published in: Theor. Comput. Sci. (2023)
Keyphrases
- graph theory
- worst case
- bandit problems
- upper confidence bound
- lower bound
- online learning
- random walk
- contextual bandit
- multi-armed bandit
- depth-first search
- regret minimization
- strongly connected
- expert advice
- regret bounds
- maximum flow
- tree decomposition
- graph structures
- learning algorithm
- graph data
- graph mining
- graph structure
- data structure
- reinforcement learning
- e-learning