The FTMPS-Project: Design and Implementation of Fault-Tolerance Techniques for Massively Parallel Systems.
Johan Vounckx, Geert Deconinck, Rudy Lauwereins, G. Viehöver, R. Wagner, Henrique Madeira, João Gabriel Silva, Frank Balbach, Jörn Altmann, Bernd Bieker, Harald Willeke
Published in: HPCN (1994)
Keyphrases
- fault tolerance
- massively parallel
- fault tolerant
- distributed systems
- high performance computing
- fault management
- case study
- single point of failure
- distributed computing
- knowledge based systems
- fine grained
- load balancing
- computer systems
- database replication
- peer to peer
- response time
- expert systems
- graphics processing units
- parallel computers
- hardware architecture
- group communication
- lower bound
- artificial intelligence
- replicated databases
- platform independent
- computer architecture
- software development
- intelligent systems