Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping.

Chenyu Jiang, Ye Tian, Zhen Jia, Shuai Zheng, Chuan Wu, Yida Wang
Published in: CoRR (2024)