Clustering Documents with Maximal Substrings.
Tomonari Masada, Atsuhiro Takasu, Yuichiro Shibata, Kiyoshi Oguri
Published in: ICEIS (2011)
Keyphrases
- data structure
- document clustering
- clustering method
- text clustering
- document collections
- clustering algorithm
- self-organizing maps
- information retrieval systems
- information retrieval
- k-means
- web documents
- text documents
- document classification
- XML documents
- topic detection
- document retrieval
- data clustering
- vector space
- information theoretic
- document clusters
- unsupervised learning
- metadata
- cosine similarity
- data objects
- spectral clustering
- hierarchical clustering
- relevant documents
- data points
- vector space model
- fuzzy clustering
- database
- probabilistic model
- document representation
- edit distance
- text mining
- distance measure
- text categorization