On the Optimization and Generalization of Multi-head Attention.
Puneesh Deora, Rouzbeh Ghaderi, Hossein Taheri, Christos Thrampoulidis
Published in: CoRR (2023)
Keyphrases
- optimization model
- global optimization
- optimization problems
- discrete optimization
- optimization process
- real time
- head pose estimation
- information retrieval
- optimization algorithm
- joint optimization
- optimization methods
- visual attention
- viewpoint
- feature selection
- information systems
- search engine
- learning algorithm
- neural network