DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models.
Sam Ade Jacobs
Masahiro Tanaka
Chengming Zhang
Minjia Zhang
Shuaiwen Leon Song
Samyam Rajbhandari
Yuxiong He
Published in:
CoRR (2023)
Keyphrases
neural network
probabilistic model
training set
fuzzy logic
complex systems
feature space
structured prediction