Schema-agnostic vs Schema-based Configurations for Blocking Methods on Homogeneous Data.
George Papadakis, George Alexiou, George Papastefanatos, Georgia Koutrika
Published in: Proc. VLDB Endow. (2015)
Keyphrases
- statistical methods
- data sets
- databases
- database
- raw data
- data processing
- high quality
- data analysis
- data sources
- high dimensional data
- semantically enriched
- special features
- semistructured data
- spatial data
- data mining techniques
- data points
- data structure
- image data
- data quality
- decision trees
- training data
- external data
- data mining applications
- data objects
- original data
- missing values
- synthetic data
- data collection
- input data
- preprocessing
- xml documents
- spectral clustering
- data mining methods
- significant improvement
- database schema
- data reduction
- record linkage
- semi structured data
- data transformation
- data representations
- knowledge discovery