Language models scale reliably with over-training and on downstream tasks.
Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Alexandros G. Dimakis, Gabriel Ilharco, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt
Published in: CoRR (2024)
Keyphrases
- language model
- language modeling
- n gram
- probabilistic model
- document retrieval
- statistical language models
- language modelling
- retrieval model
- information retrieval
- speech recognition
- test collection
- query expansion
- smoothing methods
- language models for information retrieval
- query terms
- context sensitive
- vector space model
- language model for information retrieval
- okapi bm
- ad hoc information retrieval
- pseudo relevance feedback
- translation model
- passage retrieval
- training set