Plateau Phenomenon in Gradient Descent Training of ReLU networks: Explanation, Quantification and Avoidance.
Mark Ainsworth, Yeonjong Shin
Published in: CoRR (2020)
Keyphrases
- cost function
- social networks
- network structure
- echo state networks
- recurrent networks
- network size
- loss function
- supervised learning
- training samples
- network analysis
- training process
- training phase
- network design
- objective function
- online learning
- hidden markov models
- complex networks
- community detection
- stochastic gradient descent
- knowledge base
- feature selection