Gradient-based Intra-attention Pruning on Pre-trained Language Models.
Ziqing Yang, Yiming Cui, Xin Yao, Shijin Wang
Published in: ACL (1) (2023)
Keyphrases
- language model
- pre-trained
- language modeling
- n-gram
- probabilistic model
- retrieval model
- training data
- speech recognition
- query expansion
- training examples
- document retrieval
- information retrieval
- statistical language models
- test collection
- language modelling
- language models for information retrieval
- smoothing methods
- document ranking
- relevance model
- focus of attention
- visual features
- clustering algorithm