SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices.

Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin
Published in: CoRR (2024)
Keyphrases