Transformer Feed-Forward Layers Are Key-Value Memories.
Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. Published in: CoRR (2020)
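The paper's central claim is that a transformer feed-forward layer, FF(x) = f(x·K⊤)·V, behaves like an unnormalized key-value memory: rows of the first weight matrix act as keys that detect input patterns, and rows of the second act as values that are summed with the resulting coefficients. A minimal sketch of that view is below; the dimensions, random weights, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical sizes for illustration: hidden size d, number of memory cells d_m.
d, dm = 16, 64
rng = np.random.default_rng(0)
K = rng.standard_normal((dm, d))  # "keys": each row detects an input pattern
V = rng.standard_normal((dm, d))  # "values": each row is an output vector

def ff(x):
    # Memory coefficients: how strongly each key matches the input (ReLU as f).
    m = np.maximum(x @ K.T, 0.0)
    # Output is the coefficient-weighted sum of the value vectors.
    return m @ V

x = rng.standard_normal(d)
y = ff(x)
print(y.shape)  # (16,)
```

Under this reading, the only difference from an attention-style key-value lookup is that the coefficients come from an elementwise nonlinearity rather than a softmax over the keys.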
Keyphrases
- feed-forward neural networks
- backpropagation
- neural networks
- artificial neural networks
- recurrent neural networks
- neural architecture
- hidden layer
- biologically plausible
- activation function
- visual cortex
- primate visual cortex
- fuzzy logic
- real time
- knowledge base
- neuron model
- multi-layer
- fault diagnosis