Acceleration of Large Transformer Model Training by Sensitivity-Based Layer Dropping.
Yujie Zeng
Wenlong He
Ihor Vasyltsov
Jiali Pang
Lin Chen
Published in:
AAAI (2023)
Keyphrases
computational model
prior knowledge
statistical model
probability distribution
EM algorithm
mathematical model
simulation model
supervised learning
experimental data
theoretical analysis
multi layer
neural network model
training samples
feature selection
training set
image sequences
decision trees