A high-performance matrix-matrix multiplication methodology for CPU and GPU architectures.
Vasilios I. Kelefouras, Angeliki Kritikakou, Iosif Mporas, Vasileios Kolonias
Published in: J. Supercomput. (2016)
Keyphrases
- matrix multiplication
- graphics processing units
- distributed memory
- parallel implementation
- heterogeneous computing
- message passing
- multithreading
- shared memory
- parallel computing
- graphics processors
- general purpose
- gpu implementation
- parallel computers
- compute intensive
- matrix factorization
- graphics hardware
- parallel architectures
- pc cluster
- compute unified device architecture
- scientific computing
- parallel programming
- high performance computing
- parallel machines
- multi core processors
- real time
- multistage
- special case
- memory bandwidth
- image processing
- computer vision