Task Scheduling for Spark Applications With Data Affinity on Heterogeneous Clusters.
Zhang Xiaodong, Xiaoping Li, Houan Du, Rubén Ruiz
Published in: IEEE Internet Things J. (2022)
Keyphrases
- data points
- data processing
- training data
- data analysis
- data distribution
- spatial data
- database
- data structure
- prior knowledge
- input data
- experimental data
- high quality
- knowledge discovery
- heterogeneous data
- statistical analysis
- high dimensional data
- complex data
- data quality
- data objects
- original data
- data samples
- synthetic data
- data mining techniques
- probability distribution
- association rules
- database systems
- clustering algorithm