Optimizing Distributed DNN Training Using CPUs and BlueField-2 DPUs.
Arpan Jain, Nawras Alnaasan, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda
Published in: IEEE Micro (2022)
Keyphrases
- training process
- distributed systems
- commodity hardware
- cooperative
- training set
- online learning
- fault tolerant
- distributed environment
- training samples
- database
- multi-agent
- decision trees
- hidden Markov models
- peer-to-peer
- virtual environment
- training data
- test set
- mobile agents
- metadata
- data sets
- distributed data
- distributed architecture
- distributed network
- real time