Comparing MapReduce-Based k-NN Similarity Joins on Hadoop for High-Dimensional Data.
Premysl Cech, Jakub Marousek, Jakub Lokoc, Yasin N. Silva, Jeremy Starks
Published in: ADMA (2017)
Keyphrases
- index structure
- nearest neighbor
- high dimensional data
- knn
- similarity join
- similarity search
- k nearest neighbor
- distance computation
- metric space
- high dimensional data sets
- range queries
- distance function
- r tree
- data distribution
- similarity queries
- nearest neighbor search
- indexing techniques
- text classification
- feature selection
- locality sensitive hashing
- edit distance
- support vector machine
- data sets
- similarity measure
- data points
- dimensionality reduction
- vector space
- face recognition
- high dimensional
- low dimensional
- multi step
- data management
- similar objects
- multi dimensional
- data mining
- text mining