Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs.

Pinxue Zhao, Hailin Zhang, Fangcheng Fu, Xiaonan Nie, Qibin Liu, Fang Yang, Yuanbo Peng, Dian Jiao, Shuaipeng Li, Jinbao Xue, Yangyu Tao, Bin Cui
Published in: CoRR (2024)