Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study.
Boxin Wang, Wei Ping, Peng Xu, Lawrence McAfee, Zihan Liu, Mohammad Shoeybi, Yi Dong, Oleksii Kuchaiev, Bo Li, Chaowei Xiao, Anima Anandkumar, Bryan Catanzaro
Published in: EMNLP (2023)
Keyphrases
- language model
- autoregressive
- retrieval model
- document retrieval
- query expansion
- test collection
- language models for information retrieval
- ad hoc information retrieval
- information retrieval
- language modeling
- statistical language models
- document ranking
- probabilistic model
- n-gram
- non-stationary
- relevance model
- random fields
- query terms
- gaussian markov random field
- document length
- smoothing methods
- speech recognition
- passage retrieval
- query specific
- language modelling
- pseudo relevance feedback
- vector space model
- term dependencies
- okapi bm
- text retrieval
- statistical language modeling
- sar images
- retrieval accuracy
- ad hoc retrieval
- retrieval effectiveness
- machine learning
- image retrieval
- relevance feedback
- retrieval systems
- information retrieval systems
- term frequency
- tf-idf
- vector space
- bayesian networks
- document collections