Training and inference of large language models using 8-bit floating point.

Sergio P. Perez, Yan Zhang, James Briggs, Charlie Blake, Josh Levy-Kramer, Paul Balanca, Carlo Luschi, Stephen Barlow, Andrew William Fitzgibbon
Published in: CoRR (2023)