Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James O'Neill, Sourav Dutta. Published in: ACL (2) (2023)
Keyphrases
- language model
- achieving high
- compression rate
- quantization error
- language modeling
- compression ratio
- high compression
- lossless compression
- visual quality
- n gram
- probabilistic model
- speech recognition
- video compression
- information retrieval
- retrieval model
- storage requirements
- language modelling
- query expansion
- compressed images
- test collection
- statistical language models
- image quality
- image compression
- smoothing methods
- lossy compression
- data compression
- sampling rate
- data structure
- relevance model
- reconstruction error
- language models for information retrieval
- low bit rate
- data hiding
- compression algorithm
- motion compensation
- visual words
- computer vision