Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU.
Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin
Published in: MCSoC (2012)
Keyphrases
- matrix multiplication
- code generator
- graphics processing units
- code generation
- parallel programming
- distributed memory
- shared memory
- message passing
- automatically generated
- process model
- parallel computing
- parallel implementation
- general purpose
- efficient implementation
- parallel processing
- matrix factorization
- parallel computation
- floating point
- parallel algorithm
- software development
- data management
- web services
- higher order
- knowledge management
- distributed systems
- high performance computing
- massively parallel
- model driven
- graph cuts
- modeling language
- computing systems