Handling Data Skew for Aggregation in Spark SQL Using Task Stealing.
Zeyu He, Qiuli Huang, Zhifang Li, Chuliang Weng
Published in: Int. J. Parallel Program. (2020)
Keyphrases
- data skew
- data distribution
- load balancing
- parallel database systems
- relational databases
- query language
- parallel processing
- database design
- join operations
- relational database systems
- database applications
- databases
- database queries
- database
- join algorithms
- relational model
- database technology
- sql queries
- skewed data
- data streams
- data points
- database systems
- data management