One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers.
Chuhan Wu
Fangzhao Wu
Yongfeng Huang
Published in: CoRR (2021)
Keyphrases
language model
language modeling
professional development
pre-trained
n-gram
document retrieval
speech recognition
ad hoc information retrieval
probabilistic model
query expansion
information retrieval
test collection
retrieval model
learning process
context-sensitive
learning environment
smoothing methods
mixture model
translation model