Softmax Output Approximation for Activation Memory-Efficient Training of Attention-based Networks
Changhyeon Lee, Seulki Lee. Published in: NeurIPS (2023)
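The title refers to reducing the activation memory that a softmax layer must retain for backpropagation. A relevant standard fact: the softmax gradient can be computed from the softmax *output* alone, so it suffices to save (or cheaply approximate) that output rather than the pre-softmax activations. The sketch below illustrates only this general identity, not the paper's specific approximation algorithm; the function names are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    z = np.exp(x - x.max())
    return z / z.sum()

def softmax_backward(y, grad_out):
    # Standard softmax gradient, written in terms of the output y only:
    #   dL/dx = y * (g - <g, y>)
    # Because no pre-activation value appears, storing (or approximating)
    # y is enough for the backward pass.
    return y * (grad_out - np.dot(grad_out, y))

x = np.random.randn(8)
y = softmax(x)          # forward pass; y is what must be kept (or approximated)
g = np.random.randn(8)  # upstream gradient
dx = softmax_backward(y, g)
```

Verifying `softmax_backward` against a finite-difference gradient confirms the identity; an approximation scheme would replace the stored `y` with a compressed surrogate at the cost of some gradient error.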
Keyphrases
- approximation algorithms
- memory efficient
- external memory
- recurrent networks
- training set
- network structure
- training process
- supervised learning
- neural network
- data sets
- information processing
- training samples
- artificial neural networks
- training data