Universal-KD: Attention-based Output-Grounded Intermediate Layer Knowledge Distillation.

Yimeng Wu, Mehdi Rezagholizadeh, Abbas Ghaddar, Md. Akmal Haidar, Ali Ghodsi
Published in: EMNLP (1) (2021)
Keyphrases