DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He
Published in: CoRR (2022)
Keyphrases
- artificial intelligence
- structured prediction
- scale space
- probabilistic inference
- training set
- expert systems
- training process
- training samples
- web intelligence
- power consumption
- mixture model
- test set
- ai technologies
- machine learning
- small scale
- training examples
- knowledge representation
- information processing
- online learning
- belief networks
- computational intelligence
- case based reasoning
- support vector machine
- training algorithm
- active learning
- ai systems
- computer software
- inference mechanism
- bayesian networks