One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers.
Chuhan Wu, Fangzhao Wu, Yongfeng Huang
Published in: CoRR (2021)
Keyphrases
- language model
- language modeling
- professional development
- pre-trained
- n-gram
- document retrieval
- speech recognition
- ad hoc information retrieval
- probabilistic model
- query expansion
- information retrieval
- test collection
- retrieval model
- learning process
- context sensitive
- learning environment
- smoothing methods
- mixture model
- translation model