Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size
Davis Yoshida, Allyson Ettinger, Kevin Gimpel. Published in: CoRR (2020)