GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training.
Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas S. Huang
Published in: ICLR (Workshop Poster), 2014
Keyphrases
- neural network training
- stochastic gradient descent
- least squares
- loss function
- training algorithm
- neural network
- matrix factorization
- step size
- support vector machine
- random forests
- optimization method
- particle swarm optimisation
- online algorithms
- multiple kernel learning
- importance sampling
- weight vector
- regularization parameter
- cost function
- feature vectors
- active learning
- genetic algorithm
- online learning
- convergence speed
- back propagation
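The technique named in the title, asynchronous stochastic gradient descent, can be sketched as follows. This is a minimal Hogwild-style illustration using CPU threads on a least-squares problem, not the authors' GPU implementation; all function names, the learning rate, and the worker count are illustrative assumptions:

```python
import threading
import random

def make_data(n=2000, true_w=(2.0, -3.0), seed=0):
    """Noise-free linear regression data; true_w is an illustrative target."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        y = true_w[0] * x[0] + true_w[1] * x[1]
        data.append((x, y))
    return data

def worker(w, data, lr, steps, rng):
    """Each worker samples examples and updates the SHARED weights w
    without any locking -- the 'asynchronous' part of async SGD."""
    for _ in range(steps):
        x, y = data[rng.randrange(len(data))]
        err = w[0] * x[0] + w[1] * x[1] - y   # squared-loss gradient: err * x
        w[0] -= lr * err * x[0]               # lock-free, possibly racy update
        w[1] -= lr * err * x[1]

def async_sgd(data, n_workers=4, steps=5000, lr=0.05):
    """Run several workers concurrently against one shared weight vector."""
    w = [0.0, 0.0]
    threads = [
        threading.Thread(target=worker, args=(w, data, lr, steps, random.Random(i)))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w
```

Despite occasional lost updates from the unsynchronized writes, the shared weights still converge toward the true parameters on this well-conditioned problem, which is the intuition asynchronous SGD relies on.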