Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models.

Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou
Published in: CoRR (2024)