Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information.

Published in: CoRR (2024)

Keyphrases