Targeting the Source: Selective Data Curation for Debiasing NLP Models.
Yacine Gaci, Boualem Benatallah, Fabio Casati, Khalid Benabdeslem
Published in: ECML/PKDD (2) (2023)
Keyphrases
- data sets
- experimental data
- high quality
- historical data
- statistical methods
- prior knowledge
- training data
- data structure
- data processing
- image data
- hidden variables
- database
- data distribution
- text mining
- natural language processing
- small number
- knowledge discovery
- data analysis
- data mining techniques
- data points
- high dimensional data
- synthetic data
- background knowledge
- sensor data
- end users
- clustering algorithm
- information retrieval
- data mining