FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models.
Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna
Published in: CoRR (2024)
Keyphrases
- distributed systems
- training process
- distributed computation
- communication overhead
- high speed
- lightweight
- complex systems
- fully distributed
- statistical models
- distributed network
- neural network
- communication cost
- communication networks
- interprocess communication
- local area network
- multi party
- distributed environment
- data distribution
- supervised learning