Reducing BERT Pre-Training Time from 3 Days to 76 Minutes.
Yang You
Jing Li
Jonathan Hseu
Xiaodan Song
James Demmel
Cho-Jui Hsieh
Published in:
CoRR (2019)
Keyphrases
training process
supervised learning
training algorithm
data mining
database
databases
neural network
real world
metadata
website
multi agent
active learning
training examples
back propagation
test set
training phase