Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.

Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li, Zhigang Ji, Yong Li, Wei Lin
Published in: CoRR (2024)
Keyphrases