Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty.

Inar Timiryasov, Jean-Loup Tastet
Published in: CoRR (2023)