Combining learning rate decay and weight decay with complexity gradient descent - Part I.
Pierre H. Richemond, Yike Guo
Published in: CoRR (2019)
Keyphrases
- learning rate
- error function
- natural gradient
- learning algorithm
- convergence rate
- update rule
- hidden layer
- adaptive learning rate
- convergence speed
- activation function
- training algorithm
- weight vector
- rapid convergence
- cost function
- convergence theorem
- objective function
- bp neural network algorithm
- delta bar delta
- machine learning
- high accuracy
- multilayer neural networks