LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism.

Bingyang Wu, Shengyu Liu, Yinmin Zhong, Peng Sun, Xuanzhe Liu, Xin Jin
Published in: CoRR (2024)
Keyphrases