SmartMoE: Efficiently Training Sparsely-Activated Models through Combining Offline and Online Parallelization.
Mingshu Zhai, Jiaao He, Zixuan Ma, Zan Zong, Runqing Zhang, Jidong Zhai
Published in: USENIX Annual Technical Conference (2023)
Keyphrases
- online learning
- Bayesian networks
- training set
- prior knowledge
- probabilistic model
- statistical model
- real time
- statistical models
- structured prediction
- linear model
- training process
- classification models
- sparse representation
- machine learning algorithms
- least squares
- artificial neural networks
- face recognition
- information systems