TSM2: optimizing tall-and-skinny matrix-matrix multiplication on GPUs.
Jieyang Chen
Nan Xiong
Xin Liang
Dingwen Tao
Sihuan Li
Kaiming Ouyang
Kai Zhao
Nathan DeBardeleben
Qiang Guan
Zizhong Chen
Published in:
ICS (2019)
Keyphrases
matrix multiplication
message passing
matrix factorization
distributed memory
general purpose
computer vision
software engineering
graph cuts