Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?

Cheng Zhang, Jianyi Cheng, Ilia Shumailov, George A. Constantinides, Yiren Zhao
Published in: CoRR (2023)