Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks.
Steffen Eger, Paul Youssef, Iryna Gurevych
Published in: CoRR (2019)
Keyphrases
- deep learning
- activation function
- unsupervised learning
- natural language processing
- neural network
- machine learning
- feed forward
- information extraction
- weakly supervised
- neural nets
- artificial neural networks
- learning rate
- radial basis function
- image segmentation
- computer vision
- question answering
- pairwise
- higher order
- natural language
- active learning
- prior knowledge
- high dimensional