What Algorithms can Transformers Learn? A Study in Length Generalization.
Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh M. Susskind, Samy Bengio, Preetum Nakkiran
Published in: CoRR (2023)
Keyphrases
- orders of magnitude
- recently developed
- learning algorithm
- image processing
- decision trees
- computational efficiency
- data structure
- optimization problems
- statistical analysis
- graph theory
- real time
- theoretical framework
- experimental study
- computationally efficient
- worst case
- computational cost
- evolutionary algorithm
- bayesian networks
- case study
- social networks