TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction.
Junyi Liu
Liangzhi Li
Tong Xiang
Bowen Wang
Yiming Qian
Published in:
EMNLP (Findings) (2023)
Keyphrases
language model
cost reduction
retrieval model
document retrieval
ad hoc information retrieval
test collection
information retrieval
query expansion
language modeling
language models for information retrieval
n gram
query terms
statistical language models
cross language retrieval
document length
probabilistic model
query specific
jelinek mercer
document ranking
web page retrieval
language modelling
smoothing methods
speech recognition
relevance model
text retrieval
retrieval effectiveness
pseudo feedback
cost savings
retrieval systems
term dependencies
statistical language modeling
mixture model
okapi bm
context sensitive
average precision
vector space model
information retrieval systems
term weighting
relevance feedback
image retrieval
pseudo relevance feedback
lead time
relevant documents
bayesian networks
ad hoc retrieval
retrieval accuracy
term frequency
vector space
inter document similarities
optimal solution