SelfClean: A Self-Supervised Data Cleaning Strategy.
Fabian Gröger, Simone Lionetti, Philippe Gottfrois, Álvaro González-Jiménez, Ludovic Amruthalingam, Labelling Consortium, Matthew Groh, Alexander A. Navarini, Marc Pouly
Published in: CoRR (2023)
Keyphrases
- data cleaning
- data integration
- text classification
- outlier detection
- record linkage
- data quality
- data processing
- missing values
- database
- data warehousing
- fraud detection
- web usage mining
- data warehouse
- information extraction
- decision support system
- knowledge management
- text mining
- privacy preserving
- knowledge discovery
- data sets