CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference.
Suyi Li
Hanfeng Lu
Tianyuan Wu
Minchen Yu
Qizhen Weng
Xusheng Chen
Yizhou Shan
Binhang Yuan
Wei Wang
Published in:
CoRR (2024)
Keyphrases
agent technology
probabilistic inference
generative model
data driven
Bayesian inference
Markov logic networks
multi agent systems
information retrieval
probabilistic reasoning
rank order
highly ranked
machine learning
unsupervised learning
inference engine
structured prediction
inference process