Improving Performance of Matrix Multiplication and FFT on GPU.
Xiang Cui, Yifeng Chen, Hong Mei
Published in: ICPADS (2009)
Keyphrases
- matrix multiplication
- message passing
- real time
- fast fourier transform
- matrix factorization
- floating point
- fourier transform
- distributed memory
- graphics processing units
- frequency domain
- graphics hardware
- parallel implementation
- gpu implementation
- graphics processors
- parallel processing
- parallel computing
- gpu accelerated
- fourier transformation
- shared memory
- belief propagation
- web applications
- signal processing
- collaborative filtering
- np hard
- lower bound
- computer vision