SiDA: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models.

Zhixu Du, Shiyu Li, Yuhao Wu, Xiangyu Jiang, Jingwei Sun, Qilin Zheng, Yongkai Wu, Ang Li, Hai (Helen) Li, Yiran Chen
Published in: CoRR (2023)
Keyphrases