Transformers with Learnable Activation Functions.
Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych
Published in: EACL (Findings) (2023)
Keyphrases
- activation function
- neural network
- artificial neural networks
- feed forward
- neural nets
- hidden layer
- back propagation
- neural architecture
- feed forward neural networks
- learning rate
- multilayer perceptron
- basis functions
- radial basis function
- learning algorithm
- hidden nodes
- network architecture
- fuzzy neural network
- autoregressive
- training phase
- multi layer perceptron
- probabilistic model