QueryER: A Framework for Fast Analysis-Aware Deduplication over Dirty Data.
Giorgos Alexiou, George Papastefanatos, Vassilis Stamatopoulos, Georgia Koutrika, Nectarios Koziris
Published in: CoRR (2022)
Keyphrases
- data analysis
- statistical analysis
- data sets
- original data
- image data
- data processing
- computer systems
- database
- synthetic data
- knowledge discovery process
- raw data
- data collection
- input data
- prior knowledge
- small number
- high quality
- complex data
- data acquisition
- probability distribution
- data representations
- visualization tool
- data quality
- data mining
- data points
- data distribution
- experimental data
- sensor data
- probabilistic model
- clustering algorithm
- knowledge discovery
- feature space