DaSGD: Squeezing SGD Parallelization Performance in Distributed Training Using Delayed Averaging.
Qinggang Zhou, Yawen Zhang, Pengcheng Li, Xiaoyong Liu, Jun Yang, Runsheng Wang, Ru Huang. Published in: CoRR (2020)
Keyphrases
- stochastic gradient descent
- lightweight
- training set
- distributed systems
- computer networks
- parallel execution
- supervised learning
- distributed environment
- worst case
- training speed
- shared memory
- loss function
- neural network
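The title names the core idea: in distributed training, delayed averaging lets each worker keep applying local SGD updates while the cross-worker model average is computed in the background and merged a few steps late, so communication overlaps with computation. Below is a minimal single-process sketch of that pattern on a toy quadratic objective; the delay length, the merge rule, and all variable names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of delayed-averaging SGD (not the DaSGD authors' code).
# Workers apply local SGD updates immediately; the global model average is
# "in flight" for DELAY steps before it is merged, simulating communication
# latency that overlaps with computation.
import numpy as np

rng = np.random.default_rng(0)
W, D, DELAY, STEPS, LR = 4, 10, 2, 100, 0.1  # workers, dims, delay, steps, step size

target = rng.normal(size=D)                   # toy objective: ||x - target||^2
models = [rng.normal(size=D) for _ in range(W)]
pending = []                                  # (ready_step, averaged_model) in flight

for step in range(STEPS):
    # Launch an all-reduce average of the current models; its result only
    # becomes available DELAY steps from now.
    pending.append((step + DELAY, np.mean(models, axis=0)))

    # Local computation proceeds without waiting for the average.
    for i in range(W):
        grad = 2.0 * (models[i] - target) + rng.normal(scale=0.1, size=D)
        models[i] -= LR * grad

    # Merge any delayed average that has just "arrived".
    while pending and pending[0][0] <= step:
        _, avg = pending.pop(0)
        for i in range(W):
            models[i] = 0.5 * (models[i] + avg)  # pull workers toward the stale average

print("mean distance to optimum:",
      np.mean([np.linalg.norm(m - target) for m in models]))
```

Because the merge uses an average that is DELAY steps stale, no worker ever blocks on communication; the cost is a bounded staleness in the consensus step rather than idle time on each iteration.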