A Data Deduplication Scheme Based on DBSCAN With Tolerable Clustering Deviation.
Yan Teng, Hequn Xian, Quanli Lu, Feng Guo
Published in: IEEE Access (2023)
Keyphrases
- data points
- clustering algorithm
- database
- data analysis
- prior knowledge
- statistical analysis
- data collection
- data sets
- databases
- data structure
- clustering analysis
- high dimensional data
- raw data
- clustering method
- original data
- categorical data
- data mining tasks
- distributed data
- high dimensional datasets
- synthetic data
- knowledge discovery
- training data
- spatial data
- data mining algorithms
- data distribution
- data mining techniques
- hierarchical clustering
- image data
- data sources
- multidimensional data
- XML documents
- clustering scheme