SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts.

Alexandre Muzio, Alex Sun, Churan He
Published in: CoRR (2024)