Local SGD Accelerates Convergence by Exploiting Second Order Information of the Loss Function.

Linxuan Pan, Shenghui Song

Published in: CoRR (2023)

Keyphrases

loss function
information extraction
stochastic gradient descent
support vector
pairwise
semi-supervised learning