SafeMPI - Extending MPI for Byzantine Error Detection on Parallel Clusters
Dmitry Mogilevsky, Sean Keller
Published in: CoRR (2005)
Keyphrases
- error detection
- parallel implementation
- shared memory
- error correction
- message passing interface
- fault tolerant
- fault tolerance
- parallelization strategy
- parallel programming
- error recovery
- parallel computing
- high performance computing
- massively parallel
- distributed memory
- clustering algorithm
- data cleansing
- parallel algorithm
- error correcting
- message passing
- general purpose
- fault isolation
- parallel computation
- hierarchical clustering
- error resilient
- parallel computers
- cluster analysis
- computer architecture
- parallel execution
- distributed systems
- data collection
- error control
- parallel processing
- parallel machines