A nonmonotone learning rate strategy for SGD training of deep neural networks.
Nitish Shirish Keskar, George Saon · Published in: ICASSP (2015)
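To illustrate the general idea behind the title, here is a minimal sketch of SGD on a 1-D quadratic with a learning-rate rule that may raise as well as lower the rate depending on recent loss history, i.e. a "nonmonotone" schedule in spirit. This is a hypothetical illustration only; the functions `sgd_nonmonotone`, `loss`, and `grad`, the factors 0.5/1.05, and the 4-step loss window are all assumptions for the sketch, not the strategy proposed in the paper.

```python
import random

def loss(w):
    # Simple 1-D quadratic objective with minimum at w = 3.
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)

def sgd_nonmonotone(w=0.0, lr=0.1, steps=50):
    """SGD with an illustrative nonmonotone learning-rate rule.

    The rate is halved only when the current loss exceeds the maximum
    of the last few losses; otherwise it is allowed to grow slightly,
    so the schedule is not forced to decrease monotonically.
    """
    history = []
    for _ in range(steps):
        g = grad(w) + random.gauss(0.0, 0.1)  # noisy gradient, as in SGD
        w -= lr * g
        history.append(loss(w))
        if len(history) >= 2:
            recent = history[-5:-1]  # up to the last 4 previous losses
            if history[-1] > max(recent):
                lr *= 0.5   # loss worsened beyond recent history: shrink
            else:
                lr *= 1.05  # otherwise permit a mild increase
    return w, lr, history

random.seed(0)
w, lr, hist = sgd_nonmonotone()
```

Because the rate can recover after a shrink, the schedule adapts to the noise level rather than committing to a fixed decay.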
Keyphrases
- learning rate
- backpropagation algorithm
- training algorithm
- training speed
- neural network
- adaptive learning rate
- hidden layer
- multilayer neural networks
- activation function
- feed forward neural networks
- learning algorithm
- convergence rate
- training process
- feedforward neural networks
- error function
- weight vector
- stochastic gradient descent
- back propagation
- fuzzy neural network
- multi layer perceptron
- rapid convergence
- convergence speed
- artificial neural networks
- delta bar delta
- training phase
- multilayer perceptron
- fuzzy logic
- genetic algorithm
- network architecture
- neural network model
- convergence theorem
- bp neural network algorithm
- machine learning