k-Balanced sorting and skew join in MPI and MapReduce.
Silu Huang, Ada Wai-Chee Fu
Published in: IEEE BigData (2014)
Keyphrases
- data skew
- parallel computing
- parallel processing
- parallel programming
- duplicate elimination
- join algorithms
- map reduce
- high performance computing
- load balancing
- cloud computing
- parallel implementation
- message passing
- shared memory
- high performance data mining
- massively parallel
- query optimization
- parallel algorithm
- sort merge
- space partitioning
- parallelization strategy
- data distribution
- general purpose
- join operations
- cartesian product
- message passing interface
- query processing
- sql queries
- distributed memory
- query execution
- aspect ratio
- parallel computers
- distributed computing
- skewed data
- relational databases