BigDedup: A Big Data Integration Toolkit for Duplicate Detection in Industrial Scenarios.
Luca Gagliardelli, Song Zhu, Giovanni Simonini, Sonia Bergamaschi
Published in: TE (2018)
Keyphrases
- data integration
- duplicate detection
- data cleaning
- data model
- databases
- data exchange
- data sources
- data warehouse
- data management
- biological databases
- graph search
- record linkage
- data warehousing
- query decomposition
- business intelligence
- schema mappings
- outlier detection
- data quality
- linked data
- knowledge discovery
- data extraction
- query language
- search engine
- information retrieval
- real world
- database