A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices
Chetan Jhurani, Paul Mullowney
Published in: CoRR (2013)
Keyphrases
- graphics processing units
- general purpose
- parallel implementation
- graphics processors
- user friendly
- neural network
- web services
- efficient implementation
- graphics cards
- singular value decomposition
- parallel processing
- graphics hardware
- highly parallel
- gpu implementation
- friendly interface
- single instruction multiple data