Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads.

Xihui Lin, Yunan Zhang, Suyu Ge, Barun Patra, Vishrav Chaudhary, Xia Song
Published in: CoRR (2024)