Using Dynamic Task Level Redundancy for OpenMP Fault Tolerance.
Oussama Tahan, Mohamed Shawky
Published in: ARCS (2012)
Keyphrases
- fault tolerance
- fault tolerant
- high performance computing
- distributed systems
- load balancing
- distributed computing
- response time
- high availability
- fault management
- peer to peer
- database replication
- group communication
- failure recovery
- replicated databases
- high scalability
- data replication
- single point of failure
- artificial intelligence
- replica control
- mobile agent system
- mobile agents
- data collection
- multi agent
- reinforcement learning