How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives.
Xinpeng Wang
Leonie Weissweiler
Hinrich Schütze
Barbara Plank
Published in:
ACL (2) (2023)
Keyphrases
databases
three dimensional
bayesian networks
weight assignment
neural network
data mining
machine learning
objective function
cooperative
trade off
medical images
high impact