On the Viability of Compression for Reducing the Overheads of Checkpoint/Restart-Based Fault Tolerance.
Dewan Ibtesham, Dorian C. Arnold, Patrick G. Bridges, Kurt B. Ferreira, Ron Brightwell
Published in: ICPP (2012)
Keyphrases
- fault tolerance
- response time
- fault tolerant
- distributed systems
- load balancing
- high availability
- distributed computing
- group communication
- replicated databases
- random walk
- image compression
- database replication
- peer to peer
- mobile agents
- high performance computing
- single point of failure
- fault management
- high scalability
- failure recovery
- error detection
- grid computing
- end to end