MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.

Ziheng Jiang, Haibin Lin, Yinmin Zhong, Qi Huang, Yangrui Chen, Zhi Zhang, Yanghua Peng, Xiang Li, Cong Xie, Shibiao Nong, Yulu Jia, Sun He, Hongmin Chen, Zhihao Bai, Qi Hou, Shipeng Yan, Ding Zhou, Yiyao Sheng, Zhuo Jiang, Haohan Xu, Haoran Wei, Zhang Zhang, Pengfei Nie, Leqi Zou, Sida Zhao, Liang Xiang, Zherui Liu, Zhe Li, Xiaoying Jia, Jianxi Ye, Xin Jin, Xin Liu
Published in: CoRR (2024)