基于词条与语意差异度量的文档聚类算法 (Term and Semantic Difference Metric Based Document Clustering Algorithm).
Linjing Wei, Zhichao Lian, Lianguo Wang, Zhenxing Hou
Published in: 计算机科学 (Computer Science) (2016)
Keyphrases
- clustering algorithm
- document clustering
- text representation
- document representation
- semantic information
- document space
- document content
- cosine similarity
- vector space model
- tolerance rough set
- clustering method
- text clustering
- semantic features
- query terms
- latent semantic
- text documents
- information retrieval
- retrieval systems
- k-means
- term weights
- related documents
- index terms
- term weighting
- term frequency
- keywords
- distance measure
- document type
- document images
- semantic similarity
- document collections
- semantically related
- semantic web
- document retrieval
- inverted lists
- term dependence
- natural language
- information retrieval systems
- term weighting schemes
- relevant documents
- text classification
- bag of words
- distance function
- TF-IDF
- unstructured documents
- web documents