Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent.
David Holzmüller, Ingo Steinwart
Published in: CoRR (2020)
Keyphrases
- recurrent networks
- training set
- echo state networks
- supervised learning
- online learning
- training examples
- heterogeneous networks
- network structure
- data sets
- complex networks
- stochastic gradient descent
- network design
- loss function
- objective function
- social networks
- community structure
- community detection
- network model
- training algorithm
- multi-layer perceptron
- complex systems
- test set
- conjugate gradient
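
As a rough illustration of the setting named in the title, the sketch below trains a two-layer (one hidden layer) ReLU network with full-batch gradient descent on the squared loss over a toy regression task. The width, learning rate, step count, and data are illustrative assumptions, not the paper's construction; the paper's result concerns whether this training procedure is statistically consistent as a learning method, not any particular script.

```python
# Minimal sketch (illustrative, not from the paper): a two-layer ReLU network
#   f(x) = sum_j a_j * relu(w_j * x + b_j)
# trained with full-batch gradient descent on the mean squared loss.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D regression data set (assumed for illustration).
n = 64
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3.0 * X[:, 0])

# Two-layer ReLU network with m hidden units.
m = 32
W = rng.normal(scale=1.0, size=(m, 1))          # input-to-hidden weights
b = rng.normal(scale=1.0, size=m)               # hidden biases
a = rng.normal(scale=1.0 / np.sqrt(m), size=m)  # hidden-to-output weights

def forward(X):
    pre = X @ W.T + b                 # (n, m) pre-activations
    hid = np.maximum(pre, 0.0)        # ReLU
    return pre, hid, hid @ a          # predictions, shape (n,)

lr = 0.05
for step in range(2000):
    pre, hid, pred = forward(X)
    resid = pred - y                  # (n,)
    loss = 0.5 * np.mean(resid ** 2)
    # Hand-written gradients of the mean squared loss w.r.t. all parameters.
    grad_out = resid / n                           # dLoss/dpred, (n,)
    grad_a = hid.T @ grad_out                      # (m,)
    grad_hid = np.outer(grad_out, a) * (pre > 0)   # ReLU derivative, (n, m)
    grad_W = grad_hid.T @ X                        # (m, 1)
    grad_b = grad_hid.sum(axis=0)                  # (m,)
    a -= lr * grad_a
    W -= lr * grad_W
    b -= lr * grad_b

print(f"final training loss: {loss:.4f}")
```

The gradients are written out by hand rather than with an autodiff framework so that the two-layer structure and the ReLU derivative indicator `(pre > 0)` stay explicit.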