HiRE: High Recall Approximate Top-k Estimation for Efficient LLM Inference.

Yashas Samaga, Varun Yerram, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli
Published in: CoRR (2024)