Using proportional transportation similarity with learned element semantics for XML document clustering.
Xiaojun Wan, Jianwu Yang
Published in: WWW (2006)
Keyphrases
- document clustering
- document similarity
- cosine similarity
- xml documents
- clustering algorithm
- text mining
- document representation
- clustering method
- clustering quality
- document collections
- non-negative matrix factorization
- similarity measure
- xml data
- vector space model
- text documents
- distance measure
- metadata
- databases
- document clusters
- semantic information
- xml elements
- data model
- k means
- semantic similarity
- data analysis
- tolerance rough set