A Cost-Efficient Failure-Tolerant Scheme for Distributed DNN Training.
Menglei Chen, Yu Hua, Rong Bai, Jianming Huang
Published in: ICCD (2023)
Keyphrases
- cost-efficient
- training process
- distributed systems
- training set
- distributed environment
- training algorithm
- failure prediction
- text classification
- online learning
- fault-tolerant
- training phase
- rights management
- neural network
- governmental organizations
- learning scheme
- classification scheme
- computing environments
- training samples
- peer-to-peer
- feature space
- cooperative