MPICH-V: toward a scalable fault tolerant MPI for volatile nodes.
George Bosilca, Aurélien Bouteiller, Franck Cappello, Samir Djilali, Gilles Fedak, Cécile Germain, Thomas Hérault, Pierre Lemarinier, Oleg Lodygensky, Frédéric Magniette, Vincent Néri, Anton Selikhov
Published in: SC (2002)
Keyphrases
- fault tolerant
- fault tolerance
- interconnection networks
- distributed systems
- message passing
- load balancing
- directed graph
- parallel implementation
- state machine
- network structure
- parallel algorithm
- shared memory
- high availability
- parallel computing
- high performance computing
- message passing interface
- database
- shortest path