We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond.
Afshin Khadangi
Published in: CoRR (2023)
Keyphrases
- learning rate
- convergence rate
- learning algorithm
- error function
- adaptive learning rate
- rapid convergence
- hidden layer
- convergence speed
- primal dual
- multilayer neural networks
- weight vector
- training algorithm
- convergence theorem
- training data
- variance reduction
- differential evolution
- evolutionary algorithm
- bp neural network algorithm
- delta bar delta