Achieving low-overhead fault tolerance for parallel accelerators with dynamic partial reconfiguration.
James J. Davis, Peter Y. K. Cheung
Published in: FPL (2014)
Keyphrases
- massively parallel
- fault tolerance
- low overhead
- load balancing
- fault tolerant
- high performance computing
- distributed systems
- distributed computing
- response time
- peer to peer
- shared memory
- database replication
- group communication
- mobile agents
- high reliability
- multi dimensional
- replicated databases
- fault management
- single point of failure
- failure recovery
- component failures
- energy efficient
- message passing