On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
Sean Farhat
Deming Chen
Published in: CoRR (2024)
Keyphrases
probabilistic model
image sequences
prior knowledge
data sets
neural network
small number
online learning
parameter estimation
training examples
machine learning algorithms
statistical models
modeling framework