OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport.
Alireza Pirhadi, Mohammad Hossein Moslemi, Alexander Cloninger, Mostafa Milani, Babak Salimi
Published in: Proc. ACM Manag. Data (2024)
Keyphrases
- conditional independence
- data cleaning
- Bayesian networks
- graphical models
- random variables
- probability distribution
- directed acyclic graph
- data quality
- record linkage
- data integration
- structure learning
- causal models
- data processing
- text classification
- outlier detection
- active learning
- fraud detection
- database
- axiomatic characterization
- data warehousing
- data model
- web usage mining
- knowledge discovery
- machine learning