Superior Parallel Big Data Clustering through Competitive Stochastic Sample Size Optimization in Big-means.
Rustam Mussabayev, Ravil Mussabayev
Published in: CoRR (2024)
Keyphrases
- big data
- sample size
- cloud computing
- model selection
- data management
- data processing
- random sampling
- big data analytics
- knowledge discovery
- data analysis
- social media
- variance reduction
- small sample
- data science
- vast amounts of data
- progressive sampling
- statistical power
- statistical hypothesis testing
- small samples
- data mining
- parallel processing
- business intelligence
- upper bound
- data analytics
- knowledge management
- real world
- data driven decision making
- parallel computing
- data streams
- case study