Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget.

Minh Duc Bui, Fabian David Schmidt, Goran Glavaš, Katharina von der Wense
Published in: CoRR (2024)