Similarity and Locality Based Indexing for High Performance Data Deduplication.
Wen Xia, Hong Jiang, Dan Feng, Yu Hua
Published in: IEEE Trans. Computers (2015)
Keyphrases
- data sets
- high quality
- knowledge discovery
- small number
- raw data
- data collection
- data processing
- database
- data points
- end users
- image data
- training data
- statistical analysis
- experimental data
- prior knowledge
- multi dimensional
- high dimensional
- high dimensional data
- sensor data
- data distribution
- data mining
- similarity scores