Towards Pareto Optimal Throughput in Small Language Model Serving.
Pol G. Recasens, Yue Zhu, Chen Wang, Eun Kyung Lee, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Lluís Berral
Published in: CoRR (2024)
Keyphrases
- language model
- pareto optimal
- language modeling
- multi objective
- n gram
- probabilistic model
- document retrieval
- information retrieval
- multi objective optimization
- multiple objectives
- speech recognition
- language modelling
- mixture model
- retrieval model
- statistical language models
- query expansion
- context sensitive
- nash equilibrium
- ad hoc information retrieval
- test collection
- smoothing methods
- query terms
- evolutionary algorithm
- translation model
- language model for information retrieval
- nsga ii
- pseudo relevance feedback
- optimal solution
- genetic algorithm
- relevance model
- cross lingual
- vector space