Impacts of Dirty Data on Classification and Clustering Models: An Experimental Evaluation.
Zhi-Xin Qi, Hong-Zhi Wang, An-Jie Wang
Published in: J. Comput. Sci. Technol. (2021)
Keyphrases
- experimental evaluation
- data collection
- clustering analysis
- data sets
- data objects
- prior knowledge
- data sources
- high dimensional data
- data points
- probability distribution
- high dimensional
- experimental data
- clustering algorithm
- categorical data
- database
- data reduction
- data analysis
- historical data
- neural network
- statistical methods
- cluster analysis
- machine learning methods
- machine learning
- knowledge discovery
- probabilistic model
- large scale data sets
- accurate models
- data structure
- pattern recognition
- bayesian methods
- multidimensional data
- data representations
- models built
- spectral clustering
- model selection
- data mining techniques
- feature set
- supervised learning
- feature vectors
- training data
- feature extraction
- decision trees
- feature selection