Transformer Feed-Forward Layers Are Key-Value Memories.
Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. Published in: CoRR (2020)
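The paper's central claim is that a transformer feed-forward layer, FF(x) = f(x·K⊤)·V, behaves like an unnormalized key-value memory: rows of the first weight matrix act as keys that detect input patterns, and rows of the second act as values that are summed with the resulting coefficients. A minimal sketch of that view is below; the dimensions, random weights, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical sizes for illustration: hidden size d, number of memory cells d_m.
d, dm = 16, 64
rng = np.random.default_rng(0)
K = rng.standard_normal((dm, d))  # "keys": each row detects an input pattern
V = rng.standard_normal((dm, d))  # "values": each row is an output vector

def ff(x):
    # Memory coefficients: how strongly each key matches the input (ReLU as f).
    m = np.maximum(x @ K.T, 0.0)
    # Output is the coefficient-weighted sum of the value vectors.
    return m @ V

x = rng.standard_normal(d)
y = ff(x)
print(y.shape)  # (16,)
```

Under this reading, the only difference from an attention-style key-value lookup is that the coefficients come from an elementwise nonlinearity rather than a softmax over the keys.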
Keyphrases
- feed-forward neural networks
- backpropagation
- neural networks
- artificial neural networks
- recurrent neural networks
- neural architecture
- hidden layer
- biologically plausible
- activation function
- visual cortex
- primate visual cortex
- fuzzy logic
- real time
- knowledge base
- neuron model
- multi-layer
- fault diagnosis