Efficient top-k similarity document search utilizing distributed file systems and cosine similarity.
Mahmoud Alewiwi, Cengiz Örencik, Erkay Savas
Published in: Clust. Comput. (2016)
Keyphrases
- cosine similarity
- file system
- document search
- similarity function
- similarity measure
- distance measure
- vector space
- euclidean distance
- document clustering
- semantic similarity
- tf-idf
- vector space model
- digital libraries
- k-means
- pairwise
- document retrieval
- bag of words
- image retrieval
- content based search
- database systems