Transformers with Learnable Activation Functions.
Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych
Published in: CoRR (2022)
Keyphrases
- activation function
- neural network
- feed forward
- artificial neural networks
- hidden layer
- back propagation
- learning rate
- feed forward neural networks
- neural nets
- neural architecture
- radial basis function
- network architecture
- multilayer perceptron
- basis functions
- hidden nodes
- learning algorithm
- training phase
- multi layer perceptron
- fuzzy neural network
- recurrent neural networks
- convergence rate
- small number
- support vector
- machine learning