The Failure Trace Archive: Enabling Comparative Analysis of Failures in Diverse Distributed Systems.
Derrick Kondo, Bahman Javadi, Alexandru Iosup, Dick H. J. Epema
Published in: CCGRID (2010)
Keyphrases
- distributed systems
- comparative analysis
- failure recovery
- fault tolerance
- failure detection
- root cause
- failure rate
- node failures
- fault tolerant
- data replication
- load balancing
- component failures
- repair actions
- failure modes
- distributed environment
- geographically distributed
- message passing
- mobile agents
- distributed database systems
- distributed computing
- loosely coupled
- metadata
- deadlock detection
- operating system
- semi quantitative
- software development environments
- agent based systems
- agent technology
- dynamic environments
- case study