A High-Performance Accelerator for Floating-Point Matrix Multiplication.
Xun Jia, Guiming Wu, Xianghui Xie
Published in: ISPA/IUCC (2017)
Keyphrases
- floating point
- matrix multiplication
- distributed memory
- parallel implementation
- square root
- fixed point
- message passing
- compute intensive
- floating point unit
- matrix factorization
- sparse matrices
- shared memory
- graphics processing units
- floating point arithmetic
- instruction set
- fast fourier transform
- higher order
- state space
- pairwise
- computer vision