Non-convergence of stochastic gradient descent in the training of deep neural networks.

Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek

Published in: J. Complex. (2021)

Keyphrases

stochastic gradient descent
neural network
step size
early stopping
least squares
loss function
matrix factorization
training speed
convergence rate
random forests
support vector machine
training process
weight vector
training algorithm
convergence speed
regularization parameter
online algorithms
genetic algorithm
linear SVM
multiple kernel learning
back propagation
cross validation
online learning