Scaling Retrieval-Based Language Models with a Trillion-Token Datastore.
Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, Sewon Min, Luke Zettlemoyer, Pang Wei Koh
Published in: CoRR (2024)
Keyphrases
- language model
- retrieval model
- document retrieval
- test collection
- language models for information retrieval
- query expansion
- ad hoc information retrieval
- information retrieval
- language modeling
- statistical language models
- document length
- smoothing methods
- relevance model
- cross language retrieval
- document ranking
- probabilistic model
- query terms
- n gram
- passage retrieval
- speech recognition
- vector space model
- query generation
- document level
- retrieval effectiveness
- retrieval framework
- okapi bm
- text retrieval
- language modelling
- query specific
- ir models
- statistical language modeling
- language model for information retrieval
- term dependencies
- pseudo relevance feedback
- average precision
- context sensitive
- retrieval systems
- information retrieval systems
- cross lingual
- pseudo feedback
- term frequency
- improve retrieval effectiveness
- term proximity
- language modeling approaches
- retrieval accuracy
- sentence retrieval
- out of vocabulary
- term weights
- retrieval process
- tf idf
- image database
- co occurrence
- text mining
- image retrieval