Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.
Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li, Zhigang Ji, Yong Li, Wei Lin. Published in: CoRR (2024)
Keyphrases
- management system
- cost-effective
- cooperative
- contextual information
- multi-agent
- context-aware
- computing environments
- database
- peer-to-peer
- lightweight
- computationally efficient
- mission-critical
- global knowledge
- configuration management
- fully distributed
- loosely coupled
- service quality
- computer networks
- distributed environment
- computationally expensive
- distributed systems
- neural network