Efficient Training of Large Language Models on Distributed Infrastructures: A Survey.
Jiangfei Duan, Shuo Zhang, Zerui Wang, Lijuan Jiang, Wenwen Qu, Qinghao Hu, Guoteng Wang, Qizhen Weng, Hang Yan, Xingcheng Zhang, Xipeng Qiu, Dahua Lin, Yonggang Wen, Xin Jin, Tianwei Zhang, Peng Sun
Published in: CoRR (2024)