A disciplined approach to neural network hyper-parameters: Part 1 - learning rate, batch size, momentum, and weight decay.
Leslie N. Smith
Published in: CoRR (2018)
Keyphrases
- learning rate
- hyperparameters
- batch size
- adaptive learning rate
- neural network
- model selection
- batch mode
- incremental learning
- cross validation
- closed form
- gaussian process
- poisson process
- bayesian framework
- bayesian inference
- random sampling
- support vector
- learning algorithm
- sample size
- em algorithm
- prior information
- convergence rate
- convergence speed
- maximum a posteriori
- batch processing
- noise level
- single item
- maximum likelihood
- incomplete data
- artificial neural networks
- parameter space
- linear classifiers
- training set
- data sets
- semi supervised