Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels.
Dane C. Lacey, Christie L. Alappat, Florian Lange, Georg Hager, Holger Fehske, Gerhard Wellein
Published in: CoRR (2024)
Keyphrases
- distributed memory
- multithreading
- matrix multiplication
- shared memory
- parallel implementation
- multiprocessor systems
- positive definite
- ibm sp
- single processor
- scientific computing
- data parallelism
- parallel computers
- fine grain
- positive semidefinite
- parallel computing
- parallel architecture
- data partitioning
- kernel function
- multi core processors
- multi processor
- parallel machines
- parallel algorithm
- singular value decomposition
- power consumption
- computational complexity
- parallel computation
- computational power
- message passing
- response time
- query processing