LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices.
Jung Hyun Lee, Jeonghoon Kim, June Yong Yang, Se Jung Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee. Published in: CoRR (2024)
Keyphrases
- language model
- low rank
- learning algorithm
- singular value decomposition
- singular values
- low rank matrices
- matrix factorization
- training data
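The title describes post-training quantization in which the weight-scaling matrix is parameterized in low-rank form and learned. As a purely illustrative, hypothetical sketch (based only on the title, not on the paper's actual algorithm), a per-element scale matrix can be written as a base scale plus a rank-`r` product `A @ B`, so that only `(d_out + d_in) * r` scaling parameters would need to be optimized instead of `d_out * d_in`:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_int4(w, scale):
    """Round weights to signed 4-bit codes under a per-element scale.

    Returns the integer codes and the dequantized weights.
    """
    q = np.clip(np.round(w / scale), -8, 7)
    return q, q * scale

# Toy weight matrix standing in for one linear layer of an LLM.
d_out, d_in, rank = 64, 128, 4
W = rng.normal(size=(d_out, d_in))

# Hypothetical low-rank parameterization of the scaling matrix:
# S = base + A @ B. Initializing A to zeros (LoRA-style) makes
# S equal the base scale before any learning takes place.
base = np.full((d_out, d_in), np.abs(W).max() / 7.0)
A = np.zeros((d_out, rank))
B = rng.normal(size=(rank, d_in)) * 1e-3

S = np.abs(base + A @ B) + 1e-8  # keep scales strictly positive
codes, W_hat = quantize_int4(W, S)
err = np.mean((W - W_hat) ** 2)  # reconstruction error to minimize
```

In an actual learned setup, `A` and `B` would be updated by gradient descent (with a straight-through estimator through the rounding step) to minimize a layer-output reconstruction loss; the names `quantize_int4`, `A`, `B`, and `base` here are assumptions for illustration only.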