A Novel Tensor-Expert Hybrid Parallelism Approach to Scale Mixture-of-Experts Training.

Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, Abhinav Bhatele
Published in: CoRR (2023)
Keyphrases