Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata.
Wei Hu, Amrapali Zaveri, Honglei Qiu, Michel Dumontier
Published in: BMC Bioinform. (2017)
Keyphrases
- data quality
- data cleaning
- metadata
- data warehouse
- clustering algorithm
- data cleansing
- quality management
- quality assessment
- data transformation
- digital libraries
- databases
- scientific publications
- poor quality
- text mining
- outlier detection
- information loss
- clustering analysis
- high dimensional data
- search engine
- database
- cell suppression
- statistical information
- fuzzy clustering
- data warehousing
- information extraction
- data points
- database systems
- data mining
- data sets