A throughput optimal algorithm for map task scheduling in MapReduce with data locality.
Weina Wang, Kai Zhu, Lei Ying, Jian Tan, Li Zhang
Published in: SIGMETRICS Perform. Evaluation Rev. (2013)
Keyphrases
- dynamic programming
- input data
- learning algorithm
- optimal solution
- data sets
- detection algorithm
- worst case
- globally optimal
- data analysis
- objective function
- preprocessing
- training data
- data points
- data reduction
- search space
- knowledge discovery
- lower bound
- NP-hard
- database
- noisy data
- computational complexity
- data sources
- k-means
- probability distribution
- EM algorithm
- similarity measure
- maximum a posteriori
- big data
- information loss
- exhaustive search
- data streams
- synthetic datasets