Scalable communication for high-order stencil computations using CUDA-aware MPI.
Johannes Pekkilä, Miikka S. Väisälä, Maarit J. Käpylä, Matthias Rheinhardt, Oskar Lappi
Published in: CoRR (2021)
Keyphrases
- high order
- higher order
- pairwise
- low order
- markov random field
- parallel implementation
- tensor analysis
- message passing
- parallel computation
- general purpose
- lower order
- multi task learning
- high performance computing
- low rank
- message passing interface
- fourth order
- shared memory
- machine learning
- partial differential equations
- hidden markov models
- decision trees