Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA.
Kazuya Matsumoto, Norihisa Fujita, Toshihiro Hanawa, Taisuke Boku
Published in: VECPAR (2016)
Keyphrases
- parallel implementation
- graphics processing units
- cluster of workstations
- shared memory
- parallel computing
- parallel programming
- parallel processing
- real time
- parallel computation
- graphics cards
- compute unified device architecture
- gpu implementation
- efficient implementation
- parallel computers
- general purpose
- parallel architecture
- real world
- massively parallel
- highly parallel
- evaluation method
- message passing
- high speed
- computer architecture
- graphics hardware
- floating point
- parallel architectures
- graphics processors
- shared memory multiprocessor
- bit parallel
- open source
- graphic processing unit