Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets.
Nitin Gupta, Hima Patel, Shazia Afzal, Naveen Panwar, Ruhi Sharma Mittal, Shanmukha C. Guttula, Abhinav Jain, Lokesh Nagalapatti, Sameep Mehta, Sandeep Hans, Pranay Lohia, Aniya Aggarwal, Diptikalyan Saha
Published in: CoRR (2021)
Keyphrases
- data quality
- automatic assessment
- machine learning
- quality management
- data preparation
- data transformation
- poor quality
- data warehouse
- quality assessment
- class noise
- privacy guarantees
- information loss
- data cleaning
- privacy preservation
- automatic analysis
- cell suppression
- decision trees
- artificial intelligence
- text classification
- data cleansing
- database
- learning systems
- active learning
- data mining