Understanding Counting in Small Transformers: The Interplay between Attention and Feed-Forward Layers.
Freya Behrens, Luca Biggio, Lenka Zdeborová
Published in: CoRR (2024)
Keyphrases
- feed forward
- back propagation
- neural nets
- artificial neural networks
- neural network
- recurrent neural networks
- hidden layer
- biologically plausible
- feed forward neural networks
- visual cortex
- recurrent networks
- activation function
- knowledge base
- single layer
- training algorithm
- multi layer
- neural architecture
- spiking neural networks
- multiple layers
- input image
- spiking neurons
- artificial neural