Bootstrap Sampling Based Data Cleaning and Maximum Entropy SVMs for Large Datasets.
Senzhang Wang, Zhoujun Li, Xiaoming Zhang
Published in: ICTAI (2012)
Keyphrases
- maximum entropy
- data cleaning
- data integration
- text classification
- database
- data quality
- support vector
- outlier detection
- record linkage
- maximum entropy principle
- feature selection
- linear svm
- multi class
- data processing
- conditional random fields
- data warehousing
- fraud detection
- machine learning
- knn
- data sets
- website
- web usage mining
- training data
- data warehouse
- integrity constraints
- image segmentation
- association rules
- information extraction
- data model