Hybrid Checkpointing for MPI Jobs in HPC Environments.
Chao Wang, Frank Mueller, Christian Engelmann, Stephen L. Scott
Published in: ICPADS (2010)
Keyphrases
- high performance computing
- fault tolerance
- message passing interface
- scientific computing
- distributed databases
- computing environments
- parallel algorithm
- processing times
- dynamic environments
- distributed systems
- parallel computing
- massively parallel
- grid computing
- parallel machines
- fault tolerant
- general purpose
- message passing
- shared memory
- real world
- parallelization strategy
- distributed memory
- robotic systems
- information technology