Beyond 100 million entities: large-scale blocking-based resolution for heterogeneous data.
George Papadakis, Ekaterini Ioannou, Claudia Niederée, Themis Palpanas, Wolfgang Nejdl
Published in: WSDM (2012)
Keyphrases
- heterogeneous data
- data integration
- data sources
- data management
- databases
- real world
- metadata
- heterogeneous databases
- complex data
- semantic heterogeneity
- information sources
- high-order co-clustering
- data model
- data warehouse
- web data
- heterogeneous data sources
- decision trees
- case study
- image processing
- high dimensional data
- search engine
- machine learning