Tab-Cleaner: Weakly Supervised Tabular Data Cleaning via Pre-training for E-commerce Catalog.
Kewei Cheng, Xian Li, Zhengyang Wang, Chenwei Zhang, Binxuan Huang, Yifan Ethan Xu, Xin Luna Dong, Yizhou Sun
Published in: ACL (industry) (2023)
Keyphrases
- weakly supervised
- data cleaning
- object detectors
- data integration
- data quality
- outlier detection
- text classification
- relation extraction
- record linkage
- semi supervised
- topic models
- object class
- superpixels
- database
- data processing
- missing values
- data warehousing
- named entities
- fraud detection
- data warehouse
- web usage mining
- databases
- electronic commerce
- unsupervised learning
- supervised learning
- object detection
- probabilistic model
- training set
- machine learning