BlockLLM: Multi-tenant Finer-grained Serving for Large Language Models.
Jiamin Li, Le Xu, Hong Xu, Aditya Akella
Published in: CoRR (2024)
Keyphrases
- language model
- finer grained
- multi tenant
- data center
- language modeling
- object level
- n gram
- probabilistic model
- document retrieval
- retrieval model
- language modelling
- information retrieval
- test collection
- oracle database
- query expansion
- statistical language models
- smoothing methods
- mixture model
- cloud computing
- information retrieval systems
- language models for information retrieval
- database
- machine learning
- power consumption
- workflow engine
- energy consumption
- database systems