ElasticDL: A Kubernetes-native Deep Learning Framework with Fault-tolerance and Elastic Scheduling.
Jun Zhou
Ke Zhang
Feng Zhu
Qitao Shi
Wenjing Fang
Lin Wang
Yi Wang
Published in:
WSDM (2023)
Keyphrases
fault tolerance
deep learning
fault tolerant
load balancing
distributed systems
database replication
distributed computing
data sets
pairwise
response time
machine learning
dimensionality reduction
mobile agents