Scalable communication for high-order stencil computations using CUDA-aware MPI.
Johannes Pekkilä, Miikka S. Väisälä, Maarit J. Käpylä, Matthias Rheinhardt, Oskar Lappi
Published in: Parallel Comput. (2022)
Keyphrases
- high order
- higher order
- low order
- pairwise
- general purpose
- lower order
- low rank
- Bayesian logistic regression
- parallel computing
- partial differential equations
- parallel implementation
- Zernike moments
- high performance computing
- message passing
- shared memory
- massively parallel
- parallel computation
- fourth order
- message passing interface
- tensor decomposition
- computer vision