Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning

Apoorv Dankar, Adeem Jassani, Kartikaeya Kumar
Published in: CoRR (2023)
Keyphrases