Online Diagnosis of Performance Variation in HPC Systems Using Machine Learning.
Ozan Tuncer, Emre Ates, Yijia Zhang, Ata Turk, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun
Published in: IEEE Trans. Parallel Distributed Syst. (2019)
Keyphrases
- machine learning
- online learning
- model selection
- scientific computing
- medical diagnosis
- computer vision
- pattern recognition
- active learning
- distributed systems
- building blocks
- knowledge acquisition
- computer systems
- fault tolerant
- machine learning methods
- machine learning algorithms
- neural network
- discrete event systems
- text classification
- medical images
- text mining
- natural language
- artificial intelligence
- learning algorithm
- information retrieval