In-Context Pretraining: Language Modeling Beyond Document Boundaries.
Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Wen-tau Yih, Mike Lewis
Published in: ICLR (2024)
Keyphrases
- language modeling
- language model
- information retrieval
- retrieval model
- document length
- improvements in retrieval effectiveness
- document retrieval
- query expansion
- language modeling approaches
- information retrieval systems
- n-gram
- context sensitive
- language modeling framework
- cross lingual
- relevance model
- retrieval systems
- pseudo feedback
- vector space model
- document language models
- probabilistic model
- term weighting schemes
- term weighting
- document ranking
- text classification
- document representation
- query terms
- relevant documents
- feature selection
- document collections
- query specific
- machine learning
- statistical language modeling
- term dependencies
- ad hoc information retrieval
- metadata
- multimedia
- user queries
- sentence retrieval
- tf-idf
- context dependent