LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference.

Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi
Published in: CoRR (2024)