Towards Pareto Optimal Throughput in Small Language Model Serving.

Pol G. Recasens, Yue Zhu, Chen Wang, Eun Kyung Lee, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Lluís Berral
Published in: CoRR (2024)
Keyphrases