LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Dacheng Li
Rulin Shao
Anze Xie
Eric P. Xing
Joseph E. Gonzalez
Ion Stoica
Xuezhe Ma
Hao Zhang
Published in: CoRR (2023)
Keyphrases
level parallelism
training set
distributed systems
artificial intelligence
instruction set
neural network
low cost
sufficient conditions
computer systems
fine grained
mobile agents