Integrating Element and Term Semantics for Similarity-Based XML Document Clustering.
Jianwu Yang, William K. Cheung, Xiaoou Chen
Published in: Web Intelligence (2005)
Keyphrases
- document clustering
- document representation
- cosine similarity
- xml documents
- document collections
- clustering algorithm
- text documents
- text mining
- xml elements
- vector space model
- clustering method
- tf-idf
- document clusters
- xml data
- semantic information
- xml retrieval
- non-negative matrix factorization
- databases
- tolerance rough set
- information retrieval
- cluster analysis
- digital libraries
- metadata
- latent semantic indexing
- query terms
- text classification
- knowledge base