Thematic clustering of text documents using an EM-based approach.
Sun Kim, W. John Wilbur
Published in: J. Biomed. Semant. (2012)
Keyphrases
- text documents
- document clustering
- text mining
- text clustering
- k-means
- text classification
- unsupervised learning
- text categorization
- WordNet
- clustering algorithm
- keywords
- document classification
- information extraction
- topic models
- named entities
- text data
- tf-idf
- clustering method
- hierarchical clustering
- bag of words
- text collections
- data points
- high dimensional data
- generative model
- expectation maximization
- automatic text categorization
- natural language processing
- web documents
- maximum likelihood
- co-occurrence
- clustering quality
- computer vision
- information retrieval