Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection.
Suchin Gururangan
Dallas Card
Sarah K. Dreier
Emily K. Gade
Leroy Z. Wang
Zeyu Wang
Luke Zettlemoyer
Noah A. Smith
Published in:
CoRR (2022)
Keyphrases
text data
high quality
natural language
data sets
text classification
databases
information retrieval
search engine
computer vision
prior knowledge
image data
text mining
co-occurrence
language model
document collections