OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport.
Alireza Pirhadi, Mohammad Hossein Moslemi, Alexander Cloninger, Mostafa Milani, Babak Salimi
Published in: CoRR (2024)
Keyphrases
- conditional independence
- data cleaning
- Bayesian networks
- random variables
- graphical models
- data integration
- record linkage
- probability distribution
- data quality
- outlier detection
- axiomatic characterization
- causal models
- directed acyclic graph
- data processing
- fraud detection
- database
- data warehousing
- structure learning
- text classification
- privacy preserving
- business intelligence
- web usage mining
- data sets