Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference.

Jinghan Yao, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda
Published in: IPDPS (2024)
Keyphrases
  • probabilistic model
  • learning algorithm