A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi, Tsung-Hsien Lee, Shintaro Iwasaki, Jose Gallego-Posada, Zhijing Li, Kaushik Rangadurai, Dheevatsa Mudigere, Michael Rabbat
Published in: CoRR (2023)
Keyphrases
- distributed data
- neural network
- data sharing
- training process
- distributed data mining
- data distribution
- communication cost
- pattern recognition
- integrating heterogeneous
- file system
- distributed data sources
- databases
- data mining algorithms
- training set
- distributed systems
- shared memory
- data mining
- semantically heterogeneous
- management system
- preprocessing
- data structure