Non-convergence of stochastic gradient descent in the training of deep neural networks.
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
Published in:
CoRR (2020)
Keyphrases
stochastic gradient descent
neural network
least squares
loss function
step size
early stopping
matrix factorization
training speed
convergence rate
random forests
training algorithm
training process
convergence speed
support vector machine
weight vector
regularization parameter
back propagation
multiple kernel learning
importance sampling
online algorithms
linear svm
supervised learning
iterative algorithms
support vector
collaborative filtering
training data
learning rate
logistic regression
online learning