LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale.

Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer

Published in: CoRR (2022)

Keyphrases

matrix multiplication
message passing
scale space
matrix factorization
distributed memory
special case
low resolution
probabilistic model
magnetic tape