Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes.

Takuya Akiba, Shuji Suzuki, Keisuke Fukuda

Published in: CoRR (2017)

Keyphrases

stochastic gradient descent
linear SVM
training set
training phase
hierarchical structure
online learning
data sets
least squares
training examples
training process
step size
image collections
test set
image database
pairwise
lower bound
learning algorithm