SparkDQ: Efficient generic big data quality management on distributed data-parallel computation.
Rong Gu, Yang Qi, Tongyu Wu, Zhaokang Wang, Xiaolong Xu, Chunfeng Yuan, Yihua Huang
Published in: J. Parallel Distributed Comput. (2021)
Keyphrases
- big data
- distributed data
- parallel computation
- quality management
- cloud computing
- data analysis
- data sharing
- parallel implementation
- business intelligence
- data management
- real time
- parallel algorithm
- quality assessment
- parallel computing
- data processing
- water quality
- data quality
- parallel processing
- data warehousing
- communication cost
- software engineering
- management system
- social media
- e-learning
- fuzzy theory
- social networks