Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs.
Pinxue Zhao
Hailin Zhang
Fangcheng Fu
Xiaonan Nie
Qibin Liu
Fang Yang
Yuanbo Peng
Dian Jiao
Shuaipeng Li
Jinbao Xue
Yangyu Tao
Bin Cui
Published in: CoRR (2024)
Keyphrases
fixed length
test set
general purpose
neural network
training set
training examples
training algorithm
computational power
data sets
machine learning
training data
high quality
parallel processing
training process
variable length
minimal length