A Checkpointing/Recovery System for MPI Applications on Cluster of IA-64 Computers.
Youhui Zhang, Ruini Xue, Dongsheng Wong, Weimin Zheng
Published in: ICPP Workshops (2005)
Keyphrases
- failure recovery
- main memory databases
- clustering algorithm
- distributed systems
- general purpose
- parallel algorithm
- message passing
- computer technology
- parallel implementation
- distributed database systems
- hierarchical clustering
- load balancing
- distributed databases
- single link
- data clustering
- fault tolerance
- distributed memory
- data sets
- hierarchical structure
- personal computer
- subspace clustering
- shared memory
- computer systems
- turing test
- probabilistic model
- recovery algorithm