K-Means Clustering with Bagging and MapReduce.
Hai-Guang Li, Gong-Qing Wu, Xuegang Hu, Jing Zhang, Lian Li, Xindong Wu
Published in: HICSS (2011)
Keyphrases
- ensemble methods
- ensemble learning
- decision trees
- high performance data mining
- cloud computing
- imbalanced data
- random forest
- k means
- random forests
- machine learning
- majority voting
- training set
- neural network ensembles
- base classifiers
- ensemble classification
- meta learning
- ensemble selection
- randomized trees
- clustering algorithm
- tree ensembles
- learning machines
- parallel processing
- gradient boosting
- partitioned data
- spectral clustering
- data intensive
- distributed processing
- decision tree classifiers
- generalization error
- data clustering
- classifier ensemble
- map reduce
- variance reduction
- model averaging
- generalization ability
- voting methods