Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
Published in: CoRR (2020)
Keyphrases
computational model
probabilistic model
computational complexity
high level
statistical model
prior knowledge
data sets
experimental data
prediction model
management system
maximum likelihood
multi agent systems
data compression
formal model
random fields
decision theoretic
structured prediction