Recurrent Drafter for Fast Speculative Decoding in Large Language Models.
Aonan Zhang, Chong Wang, Yi Wang, Xuanyu Zhang, Yunfei Cheng
Published in: CoRR (2024)
Keyphrases
- language model
- language modeling
- document retrieval
- speech recognition
- n-gram
- probabilistic model
- query expansion
- retrieval model
- statistical language models
- information retrieval
- context sensitive
- pseudo relevance feedback
- query terms
- language modelling
- ad hoc information retrieval
- test collection
- smoothing methods
- text retrieval
- machine learning
- vector space model
- expert finding
- term dependencies
- word error rate
- relevance feedback
- spoken term detection
- language model for information retrieval