Efficient Cross User Client Side Data Deduplication in Hadoop.
Priteshkumar Prajapati, Parth Shah, Amit Ganatra, Sandipkumar Patel
Published in: J. Comput. (2017)
Keyphrases
- data sets
- end users
- data collection
- databases
- data structure
- raw data
- data sources
- data points
- high quality
- data cleaning
- big data
- data distribution
- knowledge discovery
- database systems
- user interface
- statistical analysis
- synthetic data
- data integration
- sensor data
- experimental data
- collaborative filtering
- prior knowledge
- original data
- data quality
- database
- user model
- data intensive
- user input
- data processing