SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.

Published in: CoRR (2023)

Keyphrases