Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training.

Zexuan Zhong, Mengzhou Xia, Danqi Chen, Mike Lewis
Published in: CoRR (2024)