Using hardware multithreading to overcome broadcast/reduction latency in an associative SIMD processor.
Kevin Schaffer, Robert A. Walker
Published in: IPDPS (2008)
Keyphrases
- multithreading
- parallel computing
- computational power
- massively parallel
- power reduction
- highly efficient
- parallel architectures
- multi core processors
- distributed memory
- wireless broadcast
- parallel processing
- low latency
- shared memory
- parallel algorithm
- single instruction multiple data
- parallel implementation
- coarse grained
- memory efficient
- data partitioning
- computer architecture
- computing systems
- memory bandwidth
- response time
- parallel computers
- message passing
- instruction scheduling
- highly scalable
- prefetching
- memory requirements
- low cost