Inference acceleration for large language models using "stairs" assisted greedy generation.

Domas Grigaliunas, Mantas Lukosevicius
Published in: CoRR (2024)
Keyphrases