Improving Automatic Parallel Training via Balanced Memory Workload Optimization.
Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Xiaonan Nie, Bin Cui
Published in: CoRR (2023)
Keyphrases
- optimization algorithm
- training process
- parallel hardware
- global optimization
- data sets
- multi-threaded
- random access
- memory usage
- semi-automatic
- optimization method
- distributed shared memory
- stochastic gradient descent
- processing elements
- optimization process
- parallel processing
- fully automatic
- response time
- training set
- neural network
- test set
- associative memory
- training examples
- online learning
- massively parallel
- optimization problems
- memory space
- supervised learning
- parallel execution
- feature space
- level parallelism
- genetic algorithm