Alexandria: A Proof-of-Concept Implementation and Evaluation of Generalised Data Deduplication.
Lars Nielsen, Rasmus Vestergaard, Niloofar Yazdani, Prasad Talasila, Daniel E. Lucani, Marton Sipos
Published in: GLOBECOM Workshops (2019)
Keyphrases
- data sets
- data collection
- data sources
- statistical analysis
- xml documents
- data processing
- genetic algorithm
- record linkage
- complex data
- raw data
- experimental data
- application domains
- missing data
- synthetic data
- image data
- knowledge discovery
- prior knowledge
- data analysis
- data structure
- data distribution
- data cleaning
- metadata
- clustering algorithm
- big data
- data quality
- network structure
- test data
- data mining techniques
- training set
- probability distribution
- data points
- input data