Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method.

Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang
Published in: CoRR (2023)
Keyphrases