PClean: Bayesian Data Cleaning at Scale with Domain-Specific Probabilistic Programming.
Alexander K. Lew, Monica Agrawal, David A. Sontag, Vikash Mansinghka
Published in: AISTATS (2021)
Keyphrases
- data cleaning
- domain specific
- data integration
- Bayesian networks
- text classification
- record linkage
- outlier detection
- data quality
- database
- general purpose
- data warehousing
- web usage mining
- data warehouse
- data processing
- information extraction
- fraud detection
- integrity constraints
- query evaluation
- object oriented
- machine learning
- domain experts
- missing values
- text mining
- case study