One-pass Stochastic Gradient Descent in overparametrized two-layer neural networks.
Hanjing Zhu, Jiaming Xu
Published in: AISTATS (2021)
Keyphrases
- stochastic gradient descent
- neural network
- least squares
- matrix factorization
- loss function
- step size
- random forests
- importance sampling
- weight vector
- regularization parameter
- back propagation
- multiple kernel learning
- support vector machine
- collaborative filtering
- machine learning
- benchmark datasets
- particle swarm optimization
- training process
- pairwise
- support vector
- image processing
- learning algorithm
- online algorithms