Uncovering mesa-optimization algorithms in Transformers.
Johannes von Oswald, Eyvind Niklasson, Maximilian Schlegel, Seijin Kobayashi, Nicolas Zucchet, Nino Scherrer, Nolan Miller, Mark Sandler, Blaise Agüera y Arcas, Max Vladymyrov, Razvan Pascanu, João Sacramento
Published in: CoRR (2023)
Keyphrases
- optimization problems
- computationally efficient
- computational cost
- orders of magnitude
- optimization approaches
- real time
- recently developed
- combinatorial optimization
- computational efficiency
- data mining algorithms
- data structure
- case study
- learning algorithm
- scheduling problem
- significant improvement
- optimization method
- discrete optimization