KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation.
Marzieh S. Tahaei, Ella Charlaix, Vahid Partovi Nia, Ali Ghodsi, Mehdi Rezagholizadeh. Published in: NAACL-HLT (2022)
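As the title indicates, the paper compresses pre-trained language models by replacing large weight matrices with Kronecker products of much smaller factors, i.e. approximating a weight W as A ⊗ B. Below is a minimal PyTorch-style sketch of such a Kronecker-factorized linear layer; the class name, dimension names, and initialization are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class KroneckerLinear(nn.Module):
    """Linear layer whose weight is the Kronecker product of two small factors.

    Approximates an (out_features x in_features) weight W as A ⊗ B, where
    A is (m1 x n1) and B is (m2 x n2) with m1*m2 = out_features and
    n1*n2 = in_features. Parameter count drops from m1*m2*n1*n2
    to m1*n1 + m2*n2.
    """

    def __init__(self, in_features, out_features, n1, m1, bias=True):
        super().__init__()
        assert in_features % n1 == 0 and out_features % m1 == 0
        n2, m2 = in_features // n1, out_features // m1
        self.A = nn.Parameter(torch.randn(m1, n1) * 0.02)  # small factor A
        self.B = nn.Parameter(torch.randn(m2, n2) * 0.02)  # small factor B
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    def forward(self, x):
        # (A ⊗ B) x is computed without materializing the full weight:
        # reshape x to (..., n1, n2), contract with B, then with A.
        *batch, _ = x.shape
        x = x.reshape(*batch, self.A.shape[1], self.B.shape[1])  # (..., n1, n2)
        x = torch.einsum("pq,...nq->...np", self.B, x)           # apply B -> (..., n1, m2)
        x = torch.einsum("mn,...nq->...mq", self.A, x)           # apply A -> (..., m1, m2)
        x = x.reshape(*batch, -1)                                 # (..., m1*m2)
        return x + self.bias if self.bias is not None else x
```

For example, a hypothetical `KroneckerLinear(768, 768, n1=64, m1=64)` stores 64*64 + 12*12 parameters in place of the 768*768 of a dense layer; the paper additionally uses knowledge distillation to recover accuracy lost by the factorization.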
Keyphrases
- language model
- language modeling
- pre-trained
- probabilistic model
- n-gram
- retrieval model
- document retrieval
- speech recognition
- test collection
- information retrieval
- statistical language models
- language modelling
- query expansion
- language models for information retrieval
- multi-modal
- data points
- decision trees
- document ranking
- smoothing methods
- machine learning