A General Methodology to Quantify Biases in Natural Language Data.
Jiawei Chen, Anbang Xu, Zhe Liu, Yufan Guo, Xiaotong Liu, Yingbei Tong, Rama Akkiraju, John M. Carroll
Published in: CHI Extended Abstracts (2020)
Keyphrases
- data sets
- database
- data collection
- raw data
- data structure
- special case
- synthetic data
- natural language
- data analysis
- probability distribution
- small number
- original data
- temporal information
- multimedia data
- application domains
- background knowledge
- input data
- data mining techniques
- high quality
- information systems
- statistical analysis
- end users
- prior knowledge
- training data
- learning algorithm
- machine learning
- noisy data
- data quality
- complex data