Gradient descent optimizes over-parameterized deep ReLU networks.

Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu

Published in: Mach. Learn. (2020)

Keyphrases

cost function
network size
social networks
complex networks
network structure
loss function
information diffusion
heterogeneous networks
network design
neural network
complex systems
back propagation
least squares
multi agent
case study
deep learning
information retrieval