InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding.

Qiaoling Chen, Diandian Gu, Guoteng Wang, Xun Chen, YingTong Xiong, Ting Huang, Qinghao Hu, Xin Jin, Yonggang Wen, Tianwei Zhang, Peng Sun
Published in: CoRR (2024)
Keyphrases