BETULA: Fast clustering of large data with improved BIRCH CF-Trees.
Andreas Lang, Erich Schubert
Published in: Inf. Syst. (2022)
Keyphrases
- data points
- clustering algorithm
- data processing
- input data
- data sets
- synthetic data
- high dimensional data
- clustering method
- data collection
- multidimensional data
- data quality
- training data
- database
- clustering result
- categorical data
- data objects
- raw data
- data distribution
- data sources
- databases
- decision trees
- data structure
- image data
- data analysis
- experimental data
- social networks
- document clustering
- data mining techniques
- probability distribution