A Mixed-Method Design Approach for Empirically Based Selection of Unbiased Data Annotators.
Gautam S. Thakur, Janna Caspersen, Drahomira Herrmannova, Bryan Eaton, Jordan Burdette
Published in: ACL/IJCNLP (Findings) (2021)
Keyphrases
- synthetic data
- input data
- data sets
- statistical methods
- test data
- missing data
- high accuracy
- cost function
- noisy data
- data processing
- gold standard
- missing values
- original data
- prior information
- prior knowledge
- information loss
- training data
- data quality
- data sources
- significant improvement
- objective function
- statistical analysis
- clustering method
- high quality
- neural network
- selection algorithm
- database
- selection strategy
- fully automatic
- feature set
- supervised learning
- knowledge discovery
- probabilistic model
- dynamic programming
- pairwise
- preprocessing
- similarity measure