BSDP: A Novel Balanced Spark Data Partitioner.
Aibo Song, Bowen Peng, Jingyi Qiu, Yingying Xue, Mingyang Du
Published in: ICPADS (2021)
Keyphrases
- data sets
- database
- complex data
- experimental data
- high quality
- data structure
- original data
- statistical analysis
- data quality
- raw data
- network structure
- missing values
- decision trees
- synthetic data
- high dimensional data
- training data
- image data
- end users
- prior knowledge
- computer systems
- data collection
- information sources
- small number
- missing data
- probability distribution
- data sources
- data model
- data objects
- data analysis
- feature selection