Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function.
Wojciech Tarnowski, Piotr Warchol, Stanislaw Jastrzebski, Jacek Tabor, Maciej A. Nowak
Published in: AISTATS (2019)
Keyphrases
- activation function
- network size
- neural network
- hidden layer
- learning rate
- feed forward
- artificial neural networks
- feed forward neural networks
- back propagation
- network structure
- neural nets
- vector field
- feedforward neural networks
- multilayer perceptron
- network architecture
- bayesian networks
- training data
- social networks
- hidden nodes
- radial basis function
- basis functions
- particle swarm optimization
- high dimensional