Design of an exact data deduplication cluster.
Jürgen Kaiser, Dirk Meister, André Brinkmann, Sascha Effert
Published in: MSST (2012)
Keyphrases
- data sets
- data points
- database
- input data
- data collection
- raw data
- statistical analysis
- small number
- image data
- data quality
- noisy data
- data analysis
- data structure
- data processing
- high quality
- historical data
- complex data
- original data
- data distribution
- synthetic data
- data integrity
- databases
- experimental data
- spatial data
- prior knowledge
- clustering method
- training data
- computer systems
- knowledge discovery