MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression.

Tianyu Fu, Haofeng Huang, Xuefei Ning, Genghan Zhang, Boju Chen, Tianqi Wu, Hongyi Wang, Zixiao Huang, Shiyao Li, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang
Published in: CoRR (2024)