Failure detection and propagation in HPC systems.
George Bosilca
Aurélien Bouteiller
Amina Guermouche
Thomas Hérault
Yves Robert
Pierre Sens
Jack J. Dongarra
Published in:
SC (2016)
Keyphrases
failure detection
knowledge based systems
artificial neural networks
distributed systems
complex systems
scientific computing
database
data sets
digital libraries
software development
building blocks
learning systems
retrieval systems
high performance computing