TSM2: optimizing tall-and-skinny matrix-matrix multiplication on GPUs.

Jieyang Chen, Nan Xiong, Xin Liang, Dingwen Tao, Sihuan Li, Kaiming Ouyang, Kai Zhao, Nathan DeBardeleben, Qiang Guan, Zizhong Chen
Published in: ICS (2019)
Keyphrases
  • matrix multiplication
  • message passing
  • matrix factorization
  • distributed memory
  • general purpose
  • computer vision
  • software engineering
  • graph cuts