I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models.

Published in: CoRR (2024)
