Word importance-based similarity of documents metric (WISDM): Fast and scalable document similarity metric for analysis of scientific documents.
Viktor Botev, Kaloyan Marinov, Florian Schäfer
Published in: WOSP@JCDL (2017)
Keyphrases
- similarity metric
- scientific documents
- content similarity
- similarity metrics
- related documents
- similarity measure
- similarity measurement
- information retrieval systems
- information retrieval
- euclidean metric
- digital libraries
- web documents
- document analysis
- pdf documents
- document clustering
- text documents
- document retrieval
- document images
- document collections
- xml documents
- keywords
- vector space model
- document representation
- relevant documents
- distance function
- text categorization
- printed documents
- metadata
- search engine