The Curse of Zipf and Limits to Parallelization: A Look at the Stragglers Problem in MapReduce.
Jimmy Lin
Published in: LSDS-IR@SIGIR (2009)
Keyphrases
- parallel processing
- high dimensional
- distributed processing
- cloud computing
- high performance data mining
- high dimensional data
- power law
- dimensionality reduction
- information retrieval
- dimension reduction
- pattern recognition
- computer vision
- distributed computing
- parallel computing
- data intensive
- parallel computation
- parallel programming
- data partitioning
- mapreduce framework
- data parallelism
- data sets