Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method.
Shicheng Tan
Weng Lam Tam
Yuanchun Wang
Wenwen Gong
Shu Zhao
Peng Zhang
Jie Tang
Published in:
CoRR (2023)
Keyphrases
language model
information retrieval
probabilistic model
prior information
n-gram
dependency structure
feature selection
k-means
co-occurrence
unsupervised learning
language modeling
translation model
smoothing methods