A Fast-Start, Fault-Tolerant MPI Launcher on Dawning Supercomputers.
Xu Liu, Bibo Tu, Jianfeng Zhan, Dan Meng
Published in: PDCAT (2008)
Keyphrases
- fault tolerant
- high performance computing
- fault tolerance
- shared memory
- parallel computing
- message passing interface
- massively parallel
- parallel programming
- distributed systems
- load balancing
- distributed computing
- message passing
- distributed memory
- parallel algorithm
- high availability
- computer architecture
- state machine
- safety critical
- artificial intelligence
- parallel computers
- computing systems
- energy efficiency
- metadata
- computing resources
- error detection
- parallel machines
- general purpose
- parallel processing