System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models.
Sam Ade Jacobs
Masahiro Tanaka
Chengming Zhang
Minjia Zhang
Reza Yazdani Aminabadi
Shuaiwen Leon Song
Samyam Rajbhandari
Yuxiong He
Published in:
PODC (2024)
Keyphrases
decision trees
statistical models
genetic algorithm
probabilistic model
model selection
training examples
artificial intelligence
prior knowledge
least squares
online learning
complex systems
computational models
autoregressive