Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models.

Published in: CoRR (2024)

Keyphrases