To overlap or not to overlap: optimizing incremental MapReduce computations for on-demand data upload.
Stefan Ene, Bogdan Nicolae, Alexandru Costan, Gabriel Antoniu
Published in: DataCloud@SC (2014)
Keyphrases
- data sets
- database
- data collection
- data analysis
- training data
- data sources
- raw data
- cloud computing
- small number
- big data
- data points
- knowledge discovery
- noisy data
- complex data
- original data
- experimental data
- data distribution
- data quality
- data objects
- historical data
- high dimensional data
- statistical analysis
- computer systems
- video sequences
- decision trees
- information systems
- databases