Reducing Activation Recomputation in Large Transformer Models
Vijay Anand Korthikanti
Jared Casper
Sangkug Lym
Lawrence McAfee
Michael Andersch
Mohammad Shoeybi
Bryan Catanzaro
Published in:
MLSys (2023)
Keyphrases
probabilistic model
fuzzy logic
fault diagnosis
statistical model
experimental data
statistical models
database
data sets
genetic algorithm
computer vision
website
maximum likelihood
information processing
parameter estimation
complex systems
modeling framework