A tunable holistic resiliency approach for high-performance computing systems.
Stephen L. Scott, Christian Engelmann, Geoffroy Vallée, Thomas J. Naughton, Anand Tikotekar, George Ostrouchov, Chokchai Leangsuksun, Nichamon Naksinehaboon, Raja Nassar, Mihaela Paun, Frank Mueller, Chao Wang, Arun Babu Nagarajan, Jyothish Varma
Published in: PPOPP (2009)
Keyphrases
- distributed memory
- computing systems
- scientific computing
- computer systems
- parallel computers
- computing technologies
- autonomic computing
- high performance computing
- autonomic computing systems
- highly parallel
- high end
- computing platform
- processing units
- hardware platforms
- databases
- field programmable gate array
- parallel computing
- information technology
- case study
- information systems