Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection.

Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu
Published in: CoRR (2023)
Keyphrases
  • language model
  • speech recognition
  • language modeling
  • neural network
  • decision trees
  • image sequences
  • learning process
  • probabilistic model
  • graph cuts
  • test collection
  • document retrieval