Softmax Output Approximation for Activation Memory-Efficient Training of Attention-based Networks
Changhyeon Lee, Seulki Lee. Published in: NeurIPS (2023)
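The title refers to reducing the activation memory that a softmax layer must retain for backpropagation. A relevant standard fact: the softmax gradient can be computed from the softmax *output* alone, so it suffices to save (or cheaply approximate) that output rather than the pre-softmax activations. The sketch below illustrates only this general identity, not the paper's specific approximation algorithm; the function names are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    z = np.exp(x - x.max())
    return z / z.sum()

def softmax_backward(y, grad_out):
    # Standard softmax gradient, written in terms of the output y only:
    #   dL/dx = y * (g - <g, y>)
    # Because no pre-activation value appears, storing (or approximating)
    # y is enough for the backward pass.
    return y * (grad_out - np.dot(grad_out, y))

x = np.random.randn(8)
y = softmax(x)          # forward pass; y is what must be kept (or approximated)
g = np.random.randn(8)  # upstream gradient
dx = softmax_backward(y, g)
```

Verifying `softmax_backward` against a finite-difference gradient confirms the identity; an approximation scheme would replace the stored `y` with a compressed surrogate at the cost of some gradient error.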
Keyphrases
- approximation algorithms
- memory efficient
- external memory
- recurrent networks
- training set
- network structure
- training process
- supervised learning
- neural network
- data sets
- information processing
- training samples
- artificial neural networks
- training data