Lipschitzness Effect of a Loss Function on Generalization Performance of Deep Neural Networks Trained by Adam and AdamW Optimizers.
Mohammad Lashkari, Amin Gheibi
Published in: CoRR (2023)
Keyphrases
- loss function
- neural network
- multilayer perceptron
- pairwise
- support vector
- training process
- learning to rank
- multi layer perceptron
- risk minimization
- logistic regression
- artificial neural networks
- regularization term
- reproducing kernel Hilbert space
- hinge loss
- empirical risk
- back propagation
- radial basis function
- stochastic gradient descent
- boosting framework
- feature space