Tensor Contractions with Extended BLAS Kernels on CPU and GPU.
Yang Shi, U. N. Niranjan, Animashree Anandkumar, Cris Cecka
Published in: CoRR (2016)
Keyphrases
- graphics processing units
- gpu implementation
- graphics processors
- high order
- higher order
- general purpose
- support vector
- parallel processing
- data transfer
- heterogeneous computing
- parallel computation
- kernel function
- linear combination
- parallel implementation
- diffusion tensor
- linear algebra
- real time
- kernel methods
- scale space
- highly optimized
- image processing