ULPPACK: Fast Sub-8-bit Matrix Multiply on Commodity SIMD Hardware.
Jaeyeon Won, Jeyeon Si, Sam Son, Tae Jun Ham, Jae W. Lee
Published in: MLSys (2022)
Keyphrases
- massively parallel
- parallel architectures
- random number generator
- commodity hardware
- low cost
- hardware and software
- real time
- parallel processing
- parallel algorithm
- protection scheme
- singular value decomposition
- linear algebra
- highly parallel
- parallel computing
- image processing
- vlsi implementation
- general purpose
- floating point
- multi core processors
- floating point unit
- integer arithmetic
- computer systems
- address space
- covariance matrix
- low rank
- hardware implementation
- singular values
- computing systems
- processing elements
- hardware architecture
- fine grained
- magnetic tape
- graphics processing units
- single commodity
- parallel implementation