AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration.
Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
Published in: MLSys (2024)
Keyphrases
- lossy image compression
- quantization noise
- efficient compression
- uniform quantization
- image compression
- data compression
- compression scheme
- huffman coding
- compression ratio
- information processing
- compression algorithm
- quantization error
- transform coding
- entropy coding
- quantization scheme
- block coding
- data acquisition
- wavelet image coding
- lookup table
- neural network
- bit allocation
- activation detection
- adaptive quantization
- weighting scheme
- bit rate