On the Choice of Learning Rate for Local SGD.
Lukas Balles
Prabhu Teja Sivaprasad
Cédric Archambeau
Published in:
Trans. Mach. Learn. Res. (2024)
Keyphrases
learning rate
convergence rate
learning algorithm
training speed
weight vector
error function
update rule
rapid convergence
convergence speed
stochastic gradient descent
adaptive learning rate
hidden layer
multilayer neural networks
neural network
training algorithm
delta bar delta