Online failure prediction for HPC resources using decentralized clustering.
Alejandro Pelaez, Andres Quiroz, James C. Browne, Edward Chuah, Manish Parashar
Published in: HiPC (2014)
Keyphrases
- failure prediction
- clustering algorithm
- online learning
- clustering method
- online resources
- k-means
- limited resources
- hierarchical clustering
- cooperative
- cluster analysis
- resource management
- high performance computing
- computing resources
- multi-agent
- networked environment
- resource allocation
- distributed systems
- civil engineering
- knowledge base
- coordination mechanism
- internet-enabled