Automated application-level checkpointing of MPI programs.
Greg Bronevetsky, Daniel Marques, Keshav Pingali, Paul Stodghill
Published in: PPOPP (2003)
Keyphrases
- application level
- operating system
- network management
- quality of service
- network services
- virtual machine
- distributed databases
- overlay network
- fault tolerance
- main memory databases
- bottleneck
- message passing
- parallel algorithm
- distributed systems
- fault tolerant
- parallel implementation
- database systems
- failure recovery
- high performance computing
- social networks
- massively parallel
- distributed database systems
- distributed environment
- parallelization strategy