FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?
William Hoarau, Pierre Lemarinier, Thomas Hérault, Eric Rodriguez, Sébastien Tixeuil, Franck Cappello
Published in: CLUSTER (2006)
Keyphrases
- fault tolerant
- fault tolerance
- high performance computing
- distributed systems
- message passing
- parallel implementation
- parallel algorithm
- shared memory
- parallelization strategy
- message passing interface
- load balancing
- state machine
- parallel computing
- massively parallel
- interconnection networks
- high availability
- safety critical
- parallel programming
- index structure
- parallel computers