Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup.
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová
Published in: CoRR (2019)
Keyphrases
- stochastic gradient descent
- neural network
- teacher student
- least squares
- matrix factorization
- loss function
- step size
- random forests
- online algorithms
- importance sampling
- online learning
- professional development
- weight vector
- regularization parameter
- genetic algorithm
- multiple kernel learning
- training data
- information retrieval
- data mining
- decision trees