Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention.
Tong Yu
Ruslan Khalitov
Lei Cheng
Zhirong Yang
Published in:
CoRR (2022)
Keyphrases
dot product
positive semi definite
similarity function
kernel function
scalar product
feature space
sparse representation
gaussian kernels
image processing
high dimensional