An implementation of matrix-matrix multiplication on the Intel KNL processor with AVX-512.
Roktaek Lim, Yeongha Lee, Raehyun Kim, Jaeyoung Choi
Published in: Clust. Comput. (2018)
Keyphrases
- matrix multiplication
- distributed memory
- computer architecture
- message passing
- parallel implementation
- instruction set
- single instruction multiple data
- parallel architecture
- parallel computers
- matrix factorization
- cell broadband engine architecture
- bayesian networks
- dynamic programming
- level set
- parallel processing
- shared memory