A New Document Clustering Algorithm for Topic Discovering and Labeling.
Henry Anaya-Sánchez, Aurora Pons-Porrata, Rafael Berlanga Llavori
Published in: CIARP (2008)
Keyphrases
- clustering algorithm
- document clustering
- document content
- document set
- tolerance rough set
- text clustering
- topic discovery
- topic models
- topic hierarchy
- document images
- latent topics
- information retrieval
- image segmentation
- vector space model
- web documents
- textual content
- k means
- active learning
- statistical topic models
- document representation
- spectral clustering
- information retrieval systems
- clustering method
- retrieval systems
- text documents
- fuzzy c means
- keywords
- focused crawler
- topic specific
- document retrieval
- labeling scheme
- multi document summarization
- latent dirichlet allocation
- expert finding
- related documents
- scientific papers
- semantic information
- cross document
- news articles
- label assignment
- data clustering