Transformers with Learnable Activation Functions.

Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych

Published in: CoRR (2022)

Keyphrases

activation function
neural network
feed forward
artificial neural networks
hidden layer
back propagation
learning rate
feed forward neural networks
neural nets
neural architecture
radial basis function
network architecture
multilayer perceptron
basis functions
hidden nodes
learning algorithm
training phase
multi layer perceptron
fuzzy neural network
recurrent neural networks
convergence rate
small number
support vector
machine learning