cuFastTuckerPlus: A Stochastic Parallel Sparse FastTucker Decomposition Using GPU Tensor Cores.
Zixuan Li, Mingxing Duan, Huizhang Luo, Wangdong Yang, Kenli Li, Keqin Li
Published in: CoRR (2024)
Keyphrases
- shared memory
- tensor decomposition
- parallel computing
- parallel programming
- parallel architectures
- multi core systems
- multi core processors
- parallel computation
- address space
- distributed memory
- message passing interface
- auxiliary information
- data representation
- high order
- multicore processors
- tensor factorization
- parallel computers
- graphics processing units
- massively parallel
- multi threaded
- parallel execution
- highly parallel
- sparse representation
- processing units
- computer architecture
- multi core architecture
- level parallelism
- real time
- decomposition algorithm
- decomposition method
- sparse data
- sparse coding
- multiscale
- gpu implementation
- parallel processing
- monte carlo
- tensor space
- general purpose