Horizon: Scalable Dependency-driven Data Cleaning.
El Kindi Rezig, Mourad Ouzzani, Walid G. Aref, Ahmed K. Elmagarmid, Ahmed R. Mahmood, Michael Stonebraker
Published in: Proc. VLDB Endow. (2021)
Keyphrases
- data cleaning
- data integration
- data quality
- outlier detection
- text classification
- record linkage
- database
- data processing
- data warehousing
- missing values
- data warehouse
- information extraction
- web usage mining
- integrity constraints
- data model
- text mining
- web mining
- fraud detection
- knn
- active learning
- social networks
- real world