Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference.

Ranggi Hwang, Jianyu Wei, Shijie Cao, Changho Hwang, Xiaohu Tang, Ting Cao, Mao Yang, Minsoo Rhu
Published in: CoRR (2023)