Beyond tf-idf and Cosine Distance in Documents Dissimilarity Measure.
Sunil Aryal, Kai Ming Ting, Gholamreza Haffari, Takashi Washio
Published in: AIRS (2015)
Keyphrases
- tf-idf
- cosine distance
- dissimilarity measure
- text documents
- document clustering
- vector space model
- information retrieval
- clustering method
- edit distance
- distance metric
- text mining
- distance measure
- retrieval model
- text categorization
- similarity measure
- vector space
- document collections
- web documents
- information extraction
- text classification
- document retrieval
- euclidean distance
- feature space
- retrieval systems
- distance function
- information retrieval systems
- clustering algorithm
- semantic information
- ranking algorithm
- semantic similarity
- wordnet
- keywords
- natural language
- topic models
- image classification
- database systems
- data sets