MSDF-SGD: Most-Significant Digit-First Stochastic Gradient Descent for Arbitrary-Precision Training.
Changjun Song, Yongming Tang, Jiyuan Liu, Sige Bian, Danni Deng, He Li
Published in: FPL (2023)
Keyphrases
- stochastic gradient descent
- least squares
- stochastic gradient
- matrix factorization
- step size
- loss function
- training speed
- early stopping
- random forests
- regularization parameter
- alternating least squares
- weight vector
- multiple kernel learning
- support vector machine
- collaborative filtering
- importance sampling
- online algorithms
- optical flow
- support vector
- decision trees
- feature selection