MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool.
Cunchen Hu
Heyang Huang
Junhao Hu
Jiang Xu
Xusheng Chen
Tao Xie
Chenxi Wang
Sa Wang
Yungang Bao
Ninghui Sun
Yizhou Shan
Published in:
CoRR (2024)
Keyphrases
contextual information
neural network
main memory
memory requirements
context sensitive
associative memory
information systems
case study
query processing
peer to peer
context aware
memory usage