NASPipe: high performance and reproducible pipeline parallel supernet training via causal synchronous parallelism.
Shixiong Zhao, Fanxin Li, Xusheng Chen, Tianxiang Shen, Li Chen, Sen Wang, Nicholas Zhang, Cheng Li, Heming Cui
Published in: ASPLOS (2022)
Keyphrases
- shared memory
- parallel computers
- parallel processing
- distributed memory
- parallel computing
- pc cluster
- data parallelism
- parallel computation
- parallel execution
- ibm sp
- parallel implementation
- massively parallel
- parallel architectures
- parallel architecture
- parallel programming
- array processor
- distributed memory machines
- computational power
- coarse grain
- training set
- fine grain
- online learning
- training phase
- supervised learning
- highly parallel
- level parallelism
- commodity hardware
- bayesian networks
- processing elements
- training examples
- training algorithm
- graphics processing units
- general purpose
- training process
- causal inference
- processing pipeline
- high efficiency
- coarse grained
- learning algorithm
- multicore processors
- neural network
- data sets