Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup.
Sebastian Goldt, Madhu Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová
Published in: NeurIPS (2019)
Keyphrases
- stochastic gradient descent
- neural network
- teacher student
- least squares
- loss function
- step size
- matrix factorization
- random forests
- genetic algorithm
- support vector machine
- multiple kernel learning
- weight vector
- regularization parameter
- importance sampling
- online algorithms
- professional development
- online learning
- feature selection
- classroom learning