Automatically Generating High-performance Matrix Multiplication Kernels on the Latest Sunway Processor.
Xiaohan Tao, Yu Zhu, Boyang Wang, Jinlong Xu, Jianmin Pang, Jie Zhao
Published in: ICPP (2022)
Keyphrases
- automatically generating
- distributed memory
- matrix multiplication
- shared memory
- parallel implementation
- automatically generated
- parallel computers
- kernel function
- support vector
- cutting edge
- kernel methods
- feature space
- parallel machines
- multiple kernel learning
- pairwise
- parallel processing
- graphical models
- d objects