GPTQT: Quantize Large Language Models Twice to Push the Efficiency.

Yipin Guo, Yilin Lang, Qinyuan Ren
Published in: CoRR (2024)
Keyphrases