Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval.
João Coelho, Bruno Martins, João Magalhães, Jamie Callan, Chenyan Xiong
Published in: CoRR (2024)
Keyphrases
- language model
- ad hoc information retrieval
- document retrieval
- query terms
- document ranking
- retrieval model
- information retrieval
- trec test collections
- test collection
- language modeling
- passage retrieval
- document length
- document level
- relevance model
- query expansion
- relevant documents
- n gram
- query specific
- inter document similarities
- language modeling approaches
- statistical language models
- vector space model
- probabilistic retrieval models
- pseudo feedback
- probabilistic model
- ir models
- retrieval effectiveness
- retrieved documents
- short queries
- relevance assessments
- information retrieval systems
- language modelling
- retrieval systems
- document representation
- retrieval process
- smoothing methods
- term dependencies
- cross language retrieval
- text retrieval
- term frequency
- language models for information retrieval
- trec collections
- pseudo relevance feedback
- language modeling framework
- document collections
- translation model
- okapi bm
- term weighting
- web retrieval
- structured documents
- average precision
- expert search
- expert finding
- original query
- machine translation
- question answering
- web search
- relevance feedback
- search engine