Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection.
Chenglong Wang
Yi Lu
Yongyu Mu
Yimin Hu
Tong Xiao
Jingbo Zhu
Published in:
CoRR (2023)
Keyphrases
language model
speech recognition
language modeling
neural network
decision trees
image sequences
learning process
probabilistic model
graph cuts
test collection
document retrieval