Performance Evaluation of an Algorithm-based Asynchronous Checkpoint-Restart Fault Tolerant Application Using Mixed MPI/GPI-2.
Adrian Bazaga, Michal Pitonák
Published in: CoRR (2018)
Keyphrases
- fault tolerant
- fault tolerance
- computational complexity
- dynamic programming
- learning algorithm
- objective function
- k-means
- parallel implementation
- load balancing
- expectation maximization
- parallelization strategy
- interconnection networks
- high performance computing
- detection algorithm
- distributed systems
- search space
- optimal solution
- artificial intelligence