On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models.

Sean Farhat, Deming Chen
Published in: CoRR (2024)