PClean: Bayesian Data Cleaning at Scale with Domain-Specific Probabilistic Programming.
Alexander K. Lew, Monica Agrawal, David A. Sontag, Vikash K. Mansinghka
Published in: CoRR (2020)
Keyphrases
- data cleaning
- domain specific
- data integration
- Bayesian networks
- text classification
- outlier detection
- record linkage
- data quality
- database
- data processing
- general purpose
- fraud detection
- data warehousing
- domain experts
- data mining
- databases
- data sets
- web usage mining
- information retrieval
- data warehouse
- database management systems
- machine learning
- decision making