Lessons from the Trenches on Reproducible Evaluation of Language Models.
Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Jeffrey Hsu, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, Xiangru Tang, Kevin A. Wang, Genta Indra Winata, François Yvon, Andy Zou
Published in: CoRR (2024)
Keyphrases
- language model
- language modeling
- n gram
- probabilistic model
- speech recognition
- retrieval model
- statistical language models
- test collection
- query expansion
- smoothing methods
- document retrieval
- language modelling
- context sensitive
- information retrieval
- vector space model
- language models for information retrieval
- ad hoc information retrieval
- query terms
- pseudo relevance feedback
- cross lingual
- document ranking
- word error rate
- translation model
- spoken term detection