On the Generalization Benefit of Noise in Stochastic Gradient Descent.
Samuel L. Smith, Erich Elsen, Soham De
Published in: ICML (2020)
Keyphrases
- stochastic gradient descent
- least squares
- loss function
- random forests
- matrix factorization
- step size
- noise level
- support vector machine
- noise reduction
- importance sampling
- multiple kernel learning
- missing data
- learning algorithm
- cross validation
- regularization parameter
- text categorization
- linear combination
- feature space