Cache efficient implementation for block matrix operations.
Lukás Polok, Viorela Ila, Pavel Smrz
Published in: SpringSim (HPC) (2013)
Keyphrases
- efficient implementation
- coefficient matrix
- rows and columns
- block size
- matrix multiplication
- block wise
- row column
- efficient processing
- highly parallel
- active set
- singular value decomposition
- objective function
- processing units
- hardware implementation
- data access
- negative matrix factorization
- low rank
- matrix factorization
- main memory
- linear programming
- query processing
- pairwise
- clustering algorithm
- block matching motion estimation