TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training.
Chang Chen
Min Li
Zhihua Wu
Dianhai Yu
Chao Yang
Published in:
NeurIPS (2022)
Keyphrases
small scale
real world
topology preservation
training algorithm
training process
human experts
training examples
supervised learning
training set
case study
probability distribution
real life
feature selection
training phase
web scale
expert advice
neural network