FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference.

Zirui Liu, Qingquan Song, Qiang Charles Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu
Published in: CoRR (2024)