Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism.
Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai
Published in: ACL (Findings) (2024)
Keyphrases
- gibbs sampler
- markov chain monte carlo
- sampling methods
- random sampling
- bayesian networks
- inference process
- sampling strategy
- probabilistic inference
- sampling algorithm
- metropolis hastings
- decoding algorithm
- exact inference
- inference engine
- monte carlo
- markov chain
- knowledge base
- decision theoretic
- artificial intelligence
- efficient learning
- sampling rate
- random fields
- memory efficient
- parameter space
- decision trees
- sampling strategies
- joint detection