Design of Global Data Deduplication for a Scale-Out Distributed Storage System.
Myoungwon Oh, Sejin Park, Jungyeon Yoon, Sangjae Kim, Kang-Won Lee, Sage A. Weil, Heon Y. Yeom, Myoungsoo Jung
Published in: ICDCS (2018)
Keyphrases
- data sets
- high quality
- data analysis
- database
- sensor data
- data collection
- raw data
- missing data
- experimental data
- distributed data
- image data
- record linkage
- cooperative
- multi agent
- storage systems
- data transfer
- global scale
- synthetic data
- small number
- data sources
- training data
- statistical analysis
- data processing
- missing values
- scale space
- original data
- data quality
- global information
- data points
- heterogeneous databases
- data integrity
- case study
- remote sites