Wisteria: Nurturing Scalable Data Cleaning Infrastructure.
Daniel Haas, Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Eugene Wu
Published in: Proc. VLDB Endow. (2015)
Keyphrases
- data cleaning
- data integration
- data quality
- outlier detection
- text classification
- record linkage
- missing values
- database
- data warehousing
- fraud detection
- data processing
- data warehouse
- information extraction
- web usage mining
- data sources
- integrity constraints
- data sets
- naive bayes
- privacy preserving
- relational databases
- natural language
- search engine
- information retrieval
- machine learning
- databases