QuRating: Selecting High-Quality Data for Training Language Models.

Alexander Wettig, Aatmik Gupta, Saumya Malik, Danqi Chen
Published in: CoRR (2024)
Keyphrases
  • high quality
  • language model
  • language modeling
  • query expansion
  • decision trees
  • training data
  • knowledge discovery
  • information retrieval systems
  • ad hoc information retrieval
  • information retrieval