Supporting User-directed Fault Tolerance over Standard MPI.
Zhimin Wu, Rui Wang, Weizhi Xu, Mingyu Chen, Erlin Yao
Published in: ICPADS (2012)
Keyphrases
- fault tolerance
- fault tolerant
- high performance computing
- distributed systems
- high availability
- load balancing
- distributed computing
- group communication
- response time
- peer to peer
- database replication
- fault management
- mobile agents
- replicated databases
- single point of failure
- high scalability
- wireless sensor
- data replication
- parallel computing
- parallel implementation
- data sets