Taming Resource Heterogeneity In Distributed ML Training With Dynamic Batching.
Sahil Tyagi, Prateek Sharma
Published in: CoRR (2023)
Keyphrases
- distributed systems
- maximum likelihood
- dynamic environments
- multi agent
- lightweight
- resource sharing
- computer networks
- scheduling problem
- training process
- resource management
- single machine
- dynamic resource allocation
- distributed environment
- resource allocation
- training examples
- online learning
- data management
- virtual organization
- machine learning