Comparison of semantic and single term similarity measures for clustering Turkish documents.
Bülent Yücesoy, Sule Gündüz Ögüdücü
Published in: ICMLA (2007)
Keyphrases
- cosine similarity
- similarity measure
- semantic similarity
- document clustering
- document space
- text representation
- similarity function
- vector space model
- semantic relationships
- information retrieval
- document representation
- semantic information
- clustering method
- k-means
- text documents
- distance measure
- query terms
- semantically related
- clustering algorithm
- text clustering
- related documents
- linguistic analysis
- information retrieval systems
- euclidean distance
- unstructured documents
- tf-idf
- document collections
- term frequency
- co-occurrence
- text mining
- term document matrix
- cosine measure
- keywords
- semantic classes
- document content
- natural language
- data points
- similarity computation
- semantic content
- text categorization
- dissimilarity measure
- web documents
- term weights
- feature vectors
- retrieved documents
- relevant documents
- vector space