'Mutual Watch-dog Networking': Distributed Awareness of Faults and Critical Events in Petascale/Exascale systems.
Roberto Ammendola, Andrea Biagioni, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Pier Stanislao Paolucci, Davide Rossetti, Francesco Simula, Laura Tosoratto, Piero Vicini
Published in: CoRR (2013)
Keyphrases
- distributed systems
- high performance computing
- scientific computing
- peer to peer
- distributed computing
- computing systems
- management system
- computer systems
- multi agent
- computing environments
- event detection
- event processing
- exchange information
- heterogeneous environments
- computer networks
- knowledge based systems
- intelligent systems
- response time
- expert systems