TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training.

Chang Chen, Min Li, Zhihua Wu, Dianhai Yu, Chao Yang

Published in: CoRR (2023)

Keyphrases

small scale
knowledge base
supervised learning
mixture model
real world
artificial intelligence
domain experts
training phase
web scale
subject matter experts
multiscale
training algorithm
computer software
classifier training
labelled data