A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models.
Takuma Udagawa
Aashka Trivedi
Michele Merler
Bishwaranjan Bhattacharjee
Published in:
EMNLP (Industry Track) (2023)
Keyphrases
language model
language modeling
n-gram
probabilistic model
speech recognition
smoothing methods
search engine
document retrieval