A Distributed Algorithm for Multi-Armed Bandit with Homogeneous Rewards over Directed Graphs.
Jingxuan Zhu, Ji Liu
Published in: ACC (2021)
Keyphrases
- directed graph
- learning algorithm
- computational complexity
- expectation maximization
- dynamic programming
- maximum flow
- probabilistic model
- multi-agent
- worst case
- objective function
- undirected graph
- reinforcement learning
- decision trees
- multi-armed bandit
- optimal solution
- active learning
- NP-hard
- machine learning
- em algorithm
- multi-armed bandits