Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention.

Tong Yu, Ruslan Khalitov, Lei Cheng, Zhirong Yang
Published in: CVPR (2022)
Keyphrases
  • dot product
  • positive semi definite
  • kernel function
  • high dimensional
  • scalar product
  • similarity function
  • gaussian kernels
  • sparse representation
  • machine learning
  • training data
  • support vector
  • feature space