A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU + Fast Matrix Multiply
Paolo D'Alberto
Published in: CoRR (2012)
Keyphrases
- matrix multiplication
- graphics processing units
- message passing
- shared memory
- distributed memory
- floating point
- parallel computing
- parallel programming
- real time
- parallel implementation
- parallel computation
- matrix factorization
- general purpose
- parallel algorithm
- gpu implementation
- belief propagation
- computing systems
- cloud computing
- graphics processors
- graphics hardware
- parallel architectures
- heterogeneous computing
- high performance computing
- massively parallel
- markov random field
- computational complexity