Implicit Regularization of Stochastic Gradient Descent in Natural Language Processing: Observations and Implications.
Deren Lei, Zichen Sun, Yijun Xiao, William Yang Wang
Published in: CoRR (2018)
Keyphrases
- stochastic gradient descent
- natural language processing
- least squares
- step size
- loss function
- regularization parameter
- matrix factorization
- random forests
- machine learning
- early stopping
- information extraction
- online algorithms
- weight vector
- importance sampling
- multiple kernel learning
- support vector machine
- image restoration
- text mining
- learning algorithm
- prediction accuracy
- missing data
- linear combination
- maximum likelihood