On the Optimization and Generalization of Multi-head Attention.
Puneesh Deora
Rouzbeh Ghaderi
Hossein Taheri
Christos Thrampoulidis
Published in:
Trans. Mach. Learn. Res. (2024)
Keyphrases
optimization algorithm
global optimization
optimization problems
real time
bi level
visual attention
optimization methods
constrained optimization
case study
artificial neural networks
optimization method
optimal design
discrete optimization