Improving blocked matrix-matrix multiplication routine by utilizing AVX-512 instructions on Intel Knights Landing and Xeon Scalable processors.
Yoosang Park
Raehyun Kim
Thi My Tuyen Nguyen
Jaeyoung Choi
Published in:
Clust. Comput. (2023)
Keyphrases
matrix multiplication
distributed memory
message passing
shared memory
matrix factorization
multi core processors
parallel processing
parallel implementation
parallel algorithm
belief propagation
multiprocessor architecture
parallel machines
single instruction multiple data