A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions.
Arnulf Jentzen, Adrian Riekert
Published in: CoRR (2021)
Keyphrases
- stochastic gradient descent
- artificial neural networks
- step size
- least squares
- loss function
- early stopping
- matrix factorization
- training speed
- random forests
- convergence rate
- neural network
- online algorithms
- regularization parameter
- convergence speed
- support vector machine
- weight vector
- linear SVM
- genetic algorithm (GA)
- importance sampling
- pairwise
- training set
- training data