Implementing Strassen's Algorithm with CUTLASS on NVIDIA Volta GPUs.
Jianyu Huang, Chenhan D. Yu, Robert A. van de Geijn
Published in: CoRR (2018)
Keyphrases
- times faster
- learning algorithm
- computational cost
- computational complexity
- preprocessing
- k means
- dynamic programming
- similarity measure
- optimization algorithm
- cost function
- parallel implementation
- detection algorithm
- computationally efficient
- linear programming
- search space
- optimal solution
- simulated annealing
- expectation maximization
- worst case
- segmentation algorithm
- data structure
- matching algorithm
- parallel processing
- objective function