MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool.

Cunchen Hu, Heyang Huang, Junhao Hu, Jiang Xu, Xusheng Chen, Tao Xie, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan
Published in: CoRR (2024)
Keyphrases
  • contextual information
  • neural network
  • main memory
  • memory requirements
  • context sensitive
  • associative memory
  • information systems
  • case study
  • query processing
  • peer to peer
  • context aware
  • memory usage