A Fault Tolerance Manager with Distributed Coordinated Checkpoints for Automatic Recovery.
Jorge Villamayor, Dolores Rexachs, Emilio Luque
Published in: HPCS (2017)
Keyphrases
- fault tolerance
- fault tolerant
- distributed systems
- failure recovery
- distributed computing
- error detection
- mobile agents
- group communication
- peer to peer
- single point of failure
- database replication
- distributed query processing
- load balancing
- high availability
- fault management
- cooperative
- high scalability
- distributed environment
- multi agent
- response time
- replicated databases
- node failures
- data replication
- grid computing
- mobile agent system
- management system
- knowledge base
- distributed database systems
- database