The Implicit Biases of Stochastic Gradient Descent on Deep Neural Networks with Batch Normalization.

Ziquan Liu, Yufei Cui, Jia Wan, Yu Mao, Antoni B. Chan

Published in: CoRR (2021)

Keyphrases

stochastic gradient descent
neural network
online algorithms
step size
least squares
matrix factorization
loss function
online learning
random forests
lower bound
training process
support vector machine
worst case
multiple kernel learning
weight vector
genetic algorithm
regularization parameter
importance sampling
collaborative filtering
convergence rate
asymptotically optimal
learning algorithm