MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
Simiao Zuo
Qingru Zhang
Chen Liang
Pengcheng He
Tuo Zhao
Weizhu Chen
Published in: NAACL-HLT (2022)
Keyphrases
relative importance
mixture model
neural network
human experts
information retrieval
gaussian distribution
data sets
genetic algorithm
search engine
feature selection
multimedia
multiscale
expectation maximization
logit model