CoLoR: Co-Located Rescuers for Fault Tolerance in HPC Systems.
Zaeem Hussain, Xiaolong Cui, Taieb Znati, Rami G. Melhem
Published in: ICPADS (2018)
Keyphrases
- data sets
- fault tolerance
- distributed systems
- fault tolerant
- fault management
- single point of failure
- distributed computing
- load balancing
- high performance computing
- response time
- high availability
- high scalability
- mobile agents
- group communication
- replicated databases
- peer to peer
- database replication
- color images
- error detection
- computer systems
- computing systems
- database systems