On the Performance and Memory Footprint of Distributed Training: An Empirical Study on Transformers.
Zhengxian Lu, Fangyu Wang, Zhiwei Xu, Fei Yang, Tao Li
Published in: CoRR (2024)
Keyphrases
- memory footprint
- memory usage
- distributed environment
- training process
- training set
- cooperative
- multi-agent
- distributed systems
- lightweight
- test set
- fault-tolerant
- training phase
- distributed processing
- supervised learning
- distributed network
- peer-to-peer
- database
- decision trees
- computer networks
- communication cost
- training examples
- distributed learning
- memory requirements
- response time
- small number
- search algorithm
- training data
- databases