Reducing Activation Recomputation in Large Transformer Models.

Vijay Korthikanti, Jared Casper, Sangkug Lym, Lawrence McAfee, Michael Andersch, Mohammad Shoeybi, Bryan Catanzaro
Published in: CoRR (2022)