DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification.

Daegun Yoon, Sangyoon Oh
Published in: CoRR (2023)
Keyphrases