Accelerating LLM Inference by Enabling Intermediate Layer Decoding.

Neeraj Varshney, Agneet Chatterjee, Mihir Parmar, Chitta Baral

Published in: CoRR (2023)

Keyphrases

inference process
bayesian networks
multi layer
efficient learning
grammatical inference
application layer
decoding algorithm
case study
bayesian inference
inference engine
inference mechanism
databases
artificial neural networks
probabilistic inference
decision theoretic
decoding process