Fast Parallel Algorithms for Blocked Dense Matrix Multiplication on Shared Memory Architectures.
Gideon Nimako, Ekow J. Otoo, Daniel Ohene-Kwofie
Published in: ICA3PP (1) (2012)
Keyphrases
- shared memory
- matrix multiplication
- parallel algorithm
- distributed memory
- parallel architectures
- parallel computers
- message passing
- interconnection networks
- heterogeneous platforms
- parallel programming
- parallel computing
- parallel computation
- parallel machines
- data partitioning
- shared memory multiprocessors
- multithreading
- parallel execution
- multi-core processors
- high quality
- three-dimensional
- computer vision