Hardware thread reordering to boost OpenCL throughput on FPGAs.
Amir Momeni, Hamed Tabkhi, Gunar Schirner, David R. Kaeli
Published in: ICCD (2016)
Keyphrases
- field programmable gate array
- parallel architectures
- hardware implementation
- hardware software
- embedded systems
- parallel computing
- low cost
- hardware design
- hardware and software
- hardware architecture
- programmable logic
- image processing algorithms
- response time
- FPGA implementation
- clock frequency
- graphics processing units
- reconfigurable hardware
- massively parallel
- low latency
- real time
- shared memory
- computing systems
- FPGA technology
- computer systems
- digital signal processing
- transactional memory
- high end
- parallel processing
- image processing
- VLSI implementation
- parallel algorithm
- congestion control
- higher throughput