Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models.
James O'Neill, Sourav Dutta. Published in: CoRR (2023)
Keyphrases
- language model
- achieving high
- compression rate
- quantization error
- language modeling
- lossless compression
- video compression
- visual quality
- compression ratio
- n-gram
- storage requirements
- speech recognition
- compressed images
- probabilistic model
- language modelling
- retrieval model
- high compression
- information retrieval
- test collection
- query expansion
- statistical language models
- language models for information retrieval
- sampling rate
- lossy compression
- motion compensation
- visual words
- smoothing methods
- bit rate
- image compression
- data compression
- high frequency
- image quality
- multiresolution