Proactive Fault Tolerance in MPI Applications Via Task Migration.
Sayantan Chakravorty, Celso L. Mendes, Laxmikant V. Kalé
Published in: HiPC (2006)
Keyphrases
- fault tolerance
- high performance computing
- fault tolerant
- distributed systems
- load balancing
- distributed computing
- message passing
- response time
- parallel algorithm
- group communication
- replicated databases
- parallel implementation
- mobile agents
- peer to peer
- high availability
- parallel computing
- database replication
- shared memory
- data replication
- high scalability
- grid computing
- multi agent
- database systems
- error detection
- multimedia
- knowledge base
- failure recovery
- fault management
- databases