Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection.

Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu
Published in: CoRR (2023)
Keyphrases
  • language model
  • speech recognition
  • language modeling
  • neural network
  • decision trees
  • image sequences
  • learning process
  • probabilistic model
  • graph cuts
  • test collection
  • document retrieval
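The paper's knowledge-selection method is not reproduced in this record, but the standard knowledge-distillation objective it builds on can be sketched as follows. This is a minimal illustration of temperature-scaled KL-divergence distillation (Hinton et al.'s formulation), not the authors' approach; all function names here are my own:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student that matches the teacher exactly incurs zero loss.
print(round(kd_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]), 6))  # 0.0
```

The loss grows as the student's distribution drifts from the teacher's; in practice it is combined with the ordinary cross-entropy on ground-truth labels.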