PartRePer-MPI: Combining Fault Tolerance and Performance for MPI Applications.
Sarthak Joshi, Sathish Vadhiyar
Published in: CoRR (2023)
Keyphrases
- fault tolerance
- high performance computing
- fault tolerant
- message passing interface
- message passing
- distributed systems
- response time
- load balancing
- parallel implementation
- parallel algorithm
- high availability
- peer to peer
- distributed computing
- shared memory
- replicated databases
- fault management
- mobile agents
- database replication
- parallel computing
- data replication
- databases
- error detection
- group communication
- single point of failure