QueryER: A Framework for Fast Analysis-Aware Deduplication over Dirty Data.
Giorgos Alexiou, George Papastefanatos, Vassilis Stamatopoulos, Georgia Koutrika, Nectarios Koziris
Published in: EDBT (2025)
Keyphrases
- data analysis
- data sets
- statistical analysis
- data collection
- synthetic data
- input data
- correlation analysis
- training data
- complex data
- heterogeneous sources
- main contribution
- data sources
- high quality
- missing data
- database
- data cleaning
- data mining
- data visualization
- data objects
- original data
- raw data
- data acquisition
- knowledge discovery
- relational databases
- xml documents
- labeled data
- prior knowledge
- computer systems
- data processing
- data points