Compressing Pre-trained Language Models by Matrix Decomposition.
Matan Ben Noach, Yoav Goldberg. Published in: AACL/IJCNLP (2020)
Keyphrases
- language model
- pre-trained
- matrix decomposition
- low rank
- nonnegative matrix factorization
- training data
- probabilistic model
- training examples
- n-gram
- singular value decomposition
- information retrieval
- speech recognition
- matrix factorization
- vector space model
- retrieval model
- linear combination
- query expansion
- small number
- high dimensional data
- missing data
- principal component analysis
- data representation