Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate.
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu
Published in: CoRR (2020)
Keyphrases
- stochastic gradient descent
- learning rate
- weight vector
- step size
- convergence rate
- convergence speed
- training speed
- loss function
- learning algorithm
- least squares
- matrix factorization
- regularization parameter
- perceptron algorithm
- pairwise
- support vector machine
- feature space
- hyperplane
- logistic regression
- cost function
- random forests
- feature selection