Evaluating the reliability of a GPU pipeline to SEU and the impacts of software-based and hardware-based fault tolerance techniques.
Marcio Gonçalves, Mateus Saquetti, José Rodrigo Azambuja
Published in: Microelectron. Reliab. (2018)
Keyphrases
- fault tolerance
- error detection
- fault tolerant
- real time
- graphics hardware
- load balancing
- distributed computing
- computer systems
- distributed systems
- high availability
- hardware design
- peer to peer
- response time
- replicated databases
- graphics processors
- group communication
- graphics processing units
- mobile agents
- database replication
- pipeline architecture
- fault management
- software systems
- blue gene
- embedded systems
- parallel implementation
- failure recovery
- computing systems
- databases
- parallel computation
- massively parallel
- platform independent
- parallel computing
- error correction
- software components
- parallel processing
- data management
- single point of failure
- expert systems