A new similarity measure for vector space models in text classification and information retrieval.
Mete Eminagaoglu
Published in: J. Inf. Sci. (2022)
Keyphrases
- vector space model
- text classification
- information retrieval
- cosine measure
- similarity measure
- semantic similarity
- text mining
- retrieval model
- language modeling
- language model
- vector space
- tf-idf
- text categorization
- n-gram
- feature selection
- agglomerative hierarchical clustering
- web documents
- latent semantic indexing
- document clustering
- ir models
- bag of words
- machine learning
- knn
- feature vectors
- distance measure
- document representation
- co-occurrence
- information retrieval systems
- document collections
- test collection
- semantic information
- term frequency
- term weighting
- relevance model
- text documents
- information extraction
- trec collections
- user queries