What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao, Thomas Wang, Daniel Hesslow, Stas Bekman, M. Saiful Bari, Stella Biderman, Hady Elsahar, Niklas Muennighoff, Jason Phang, Ofir Press, Colin Raffel, Victor Sanh, Sheng Shen, Lintang Sutawika, Jaesung Tae, Zheng Xin Yong, Julien Launay, Iz Beltagy
Published in: EMNLP (Findings) (2022)
Keyphrases
- language model
- language modeling
- n gram
- document retrieval
- probabilistic model
- speech recognition
- query expansion
- language modelling
- information retrieval
- retrieval model
- context sensitive
- document ranking
- query terms
- test collection
- mixture model
- ad hoc information retrieval
- relevance model
- smoothing methods
- word error rate
- translation model
- query specific
- document length