Fault Tolerance in Iterative-Convergent Machine Learning.
Aurick Qiao, Bryon Aragam, Bingjing Zhang, Eric P. Xing
Published in: ICML (2019)
Keyphrases
- fault tolerance
- machine learning
- fault tolerant
- distributed systems
- distributed computing
- load balancing
- response time
- high availability
- peer to peer
- replicated databases
- mobile agents
- group communication
- fault management
- knowledge acquisition
- high scalability
- high performance computing
- database replication
- knowledge representation
- data replication
- data mining
- failure recovery
- single point of failure
- distributed query processing
- error detection
- reinforcement learning
- databases
- cooperative
- artificial intelligence
- replica control