Superior Parallel Big Data Clustering Through Competitive Stochastic Sample Size Optimization in Big-Means.
Rustam Mussabayev, Ravil Mussabayev
Published in: ACIIDS (2) (2024)
Keyphrases
- big data
- sample size
- data analysis
- cloud computing
- data processing
- data management
- model selection
- social media
- upper bound
- random sampling
- knowledge discovery
- small samples
- statistical hypothesis testing
- business intelligence
- small sample
- vast amounts of data
- big data analytics
- data warehousing
- worst case
- data science
- parallel processing
- special case
- random sample
- commodity hardware
- real world
- data points
- variance reduction
- data streams
- statistical power
- e-learning
- feature selection
- machine learning