FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs.

Young Jin Kim, Rawn Henry, Raffy Fahim, Hany Hassan Awadalla
Published in: CoRR (2023)
Keyphrases
  • fine grained
  • coarse grained
  • access control
  • tightly coupled
  • massively parallel
  • computational complexity
  • quantization error
  • metadata
  • web services
  • privacy preserving
  • data provenance
  • data lineage