Parallel Huge Matrix Multiplication on a Cluster with GPGPU Accelerators.
Seungyo Ryu, Dongseung Kim
Published in: IPDPS Workshops (2018)
Keyphrases
- matrix multiplication
- distributed memory
- graphics processing units
- general purpose
- shared memory
- parallel implementation
- message passing
- parallel programming
- clustering algorithm
- matrix factorization
- parallel processing
- floating point
- massively parallel
- computing systems
- parallel computing
- single chip
- efficient implementation
- high performance computing
- parallel architectures
- compute unified device architecture