Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding.

Published in: CoRR (2024)
