GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model.
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang
Published in: ACL (industry) (2023)
Keyphrases
- language model
- probabilistic model
- general knowledge
- language modeling
- pre-trained
- information retrieval
- document retrieval
- n-gram
- retrieval model
- test collection
- speech recognition
- data mining
- query expansion
- generative model
- supervised learning
- vector space
- prior knowledge
- feature space
- reinforcement learning
- data sets