Parallel sparse LU decomposition using FPGA with an efficient cache architecture.
Xiang Ge, Hengliang Zhu, Fan Yang, Lingli Wang, Xuan Zeng
Published in: ASICON (2017)
Keyphrases
- pipelined architecture
- parallel architecture
- hardware implementation
- systolic array
- multithreading
- parallel hardware
- hardware architecture
- real time
- software implementation
- field programmable gate array
- level parallelism
- master slave
- parallel implementation
- multi processor
- distributed processing
- fpga technology
- shared memory
- hardware design
- fpga implementation
- management system
- tensor decomposition
- hardware architectures
- parallel processing
- processing elements
- dedicated hardware
- high speed
- query processing
- parallel computing
- low cost
- multiprocessor systems
- memory hierarchy
- data flow
- processing units
- verilog hdl
- reconfigurable hardware
- caching scheme
- distributed memory
- highly efficient
- signal processing
- high dimensional