Hadoop Based Scalable Cluster Deduplication for Big Data.
Qing Liu, Yinjin Fu, Guiqiang Ni, Rui Hou
Published in: ICDCS Workshops (2016)
Keyphrases
- big data
- cloud computing
- mapreduce framework
- data intensive
- commodity hardware
- data intensive computing
- data management
- business intelligence
- unstructured data
- data analysis
- social media
- massive data
- data processing
- high volume
- knowledge discovery
- data science
- vast amounts of data
- social computing
- data analytics
- big data analytics
- map reduce
- distributed computing
- data warehousing