Batch Normalization Is Blind to the First and Second Derivatives of the Loss.

Published in: AAAI (2024)

Keyphrases