Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models.
Chaitanya Malaviya, Sudeep Bhatia, Mark Yatskar
Published in: EMNLP (2022)
Keyphrases
- experimental data
- historical data
- data collection
- data sets
- raw data
- statistical methods
- statistical analysis
- prior knowledge
- image data
- small number
- data analysis
- database
- learning models
- original data
- sensor data
- training data
- missing data
- data mining tools
- data quality
- feature selection
- search algorithm
- high dimensional data
- computer systems
- image retrieval
- data mining techniques
- knowledge discovery
- end users