DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale.
Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He
Published in: ICML (2022)
Keyphrases
- artificial intelligence
- structured prediction
- case based reasoning
- power consumption
- mixture model
- web intelligence
- test set
- training phase
- machine learning
- training set
- expert systems
- probabilistic inference
- intelligent systems
- supervised learning
- ai systems
- domain experts
- bayesian networks
- bayesian model
- knowledge base
- dirichlet process
- scale space
- inference process
- exponential family
- dynamic bayesian networks
- training process
- hidden markov models
- training samples