Application Level Fault Recovery: Using Fault-Tolerant Open MPI in a PDE Solver.
Md. Mohsin Ali, James Southern, Peter E. Strazdins, Brendan Harding
Published in: IPDPS Workshops (2014)
Keyphrases
- fault tolerant
- application level
- fault tolerance
- fault isolation
- distributed systems
- operating system
- error detection
- network management
- quality of service
- partial differential equations
- message passing
- fault diagnosis
- fault detection
- load balancing
- virtual machine
- parallel algorithm
- overlay network
- state machine
- high availability
- bottleneck
- parallel implementation
- safety critical
- massively parallel
- mobile agents
- physical systems
- interconnection networks
- wireless networks