A Guide for Achieving High Performance with Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations.

Published in: IEEE Trans. Parallel Distributed Syst. (2018)

Keyphrases