Parallel similarity joins on massive high-dimensional data using MapReduce.
Youzhong Ma, Xiaofeng Meng, Shaoya Wang
Published in: Concurr. Comput. Pract. Exp. (2016)
Keyphrases
- high dimensional data
- similarity join
- similarity search
- high dimensional data sets
- data analysis
- nearest neighbor
- high dimensional
- metric space
- dimensionality reduction
- data sets
- subspace clustering
- low dimensional
- data distribution
- data points
- dimension reduction
- distance computation
- distance function
- edit distance
- uncertain data
- structural similarity
- knowledge discovery
- join algorithms
- nearest neighbor search
- image processing
- database systems
- neural network
- input data