Extremely Small BERT Models from Mixed-Vocabulary Training.
Sanqiang Zhao
Raghav Gupta
Yang Song
Denny Zhou
Published in:
EACL (2021)
Keyphrases
probabilistic model
neural network
real world
training set
prior knowledge
experimental data
metadata
pairwise
small number
model selection
parameter estimation
process model
labeled data for training