Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method.

Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang
Published in: ACL (Findings) (2023)
Keyphrases
  • language model
  • unsupervised learning
  • probabilistic model
  • EM algorithm
  • statistical model
  • Bayesian networks
  • error rate
  • test collection
  • relevance model
  • statistical language models