Unsupervised Blocking of Imbalanced Datasets for Record Matching.
Chenxiao Dou, Daniel Sun, Raymond K. Wong
Published in: WISE (2) (2016)
Keyphrases
- imbalanced datasets
- class distribution
- learning from imbalanced data
- cost-sensitive learning
- class imbalance
- sampling methods
- semi-supervised
- imbalanced data
- decision trees
- unsupervised learning
- supervised learning
- ensemble methods
- feature selection algorithms
- neural network
- unlabeled data
- training dataset
- data sets