Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism.

Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai
Published in: CoRR (2024)
Keyphrases