Collective operations in application-level fault-tolerant MPI.
Greg Bronevetsky, Daniel Marques, Keshav Pingali, Paul Stodghill
Published in: ICS (2003)
Keyphrases
- fault tolerant
- application level
- fault tolerance
- distributed systems
- operating system
- network management
- network services
- message passing
- high performance computing
- quality of service
- load balancing
- virtual machine
- parallel algorithm
- parallel implementation
- low cost
- mobile agent system
- mobile agents
- overlay network
- databases
- safety critical
- bottleneck
- parallel computing
- shared memory
- computer systems
- response time