Enabling persistent identification of groups of duplicates in data aggregators.
Giorgos Alexiou, Marios Meimaris, George Papastefanatos
Published in: ICDE Workshops (2016)
Keyphrases
- data sets
- data structure
- experimental data
- high quality
- raw data
- data analysis
- data sources
- data collection
- data quality
- small number
- original data
- end users
- statistical analysis
- probability distribution
- historical data
- previously identified
- network structure
- big data
- real time
- application domains
- sensor data
- missing data
- data processing
- data mining techniques
- data management
- data points
- training data
- website
- learning algorithm
- information retrieval