Reducing Activation Recomputation in Large Transformer Models.
Vijay Korthikanti
Jared Casper
Sangkug Lym
Lawrence McAfee
Michael Andersch
Mohammad Shoeybi
Bryan Catanzaro
Published in:
CoRR (2022)
Keyphrases
statistical models
data sets
databases
decision trees
probabilistic model
database
data warehouse
statistical model
fuzzy logic
parametric models
bayesian framework
experimental data
complex systems
parameter estimation
learning algorithm
neural network
real time