Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores.
Orestis Zachariadis, Nitin Satpute, Juan Gómez-Luna, Joaquín Olivares
Published in: CoRR (2020)
Keyphrases
- sparse matrix
- matrix multiplication
- distributed memory
- message passing
- parallel implementation
- high order
- GPU implementation
- random projections
- higher order
- matrix factorization
- graphics processing units
- floating point
- diffusion tensor
- dimensionality reduction
- image processing
- parallel computing
- rows and columns
- collaborative filtering