TensorOpt: Exploring the Tradeoffs in Distributed DNN Training With Auto-Parallelism.
Zhenkun Cai, Xiao Yan, Kaihao Ma, Yidi Wu, Yuzhen Huang, James Cheng, Teng Su, Fan Yu
Published in: IEEE Trans. Parallel Distributed Syst. (2022)
Keyphrases
- training process
- distributed systems
- training set
- multi agent
- cooperative
- parallel execution
- parallel processing
- lightweight
- test set
- shared memory
- communication overhead
- training phase
- parallel computing
- mobile agents
- fault tolerant
- distributed data
- commodity hardware
- distributed network
- database
- data flow
- training samples
- online learning
- peer to peer
- query processing
- artificial neural networks
- search algorithm
- databases