Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping.
Chenyu Jiang, Ye Tian, Zhen Jia, Shuai Zheng, Chuan Wu, Yida Wang
Published in: CoRR (2024)
Keyphrases
- communication systems
- training set
- graph representation
- graph model
- weighted graph
- graph structure
- overlapping communities
- subgraph isomorphism
- data sets
- directed graph
- training examples
- mixture model
- hearing impaired
- communities in social networks
- online learning
- random walk
- communication channels
- test set
- bipartite graph
- graph matching
- communication networks
- directed acyclic graph
- training samples
- graph databases
- undirected graph
- graph theory
- graph based algorithm
- graphical models