
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models.

Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Shuaiwen Leon Song, Samyam Rajbhandari, Yuxiong He
Published in: CoRR (2023)
Keyphrases
  • neural network
  • probabilistic model
  • training set
  • fuzzy logic
  • complex systems
  • feature space
  • structured prediction