Text document classification based on mixture models.
Jana Novovicová, Antonín Malík
Published in: Kybernetika (2004)
Keyphrases
- document classification
- mixture model
- text documents
- text mining
- web documents
- Gaussian mixture model
- EM algorithm
- text categorization
- density estimation
- text classification
- probabilistic model
- generative model
- model selection
- keywords
- expectation maximization
- language model
- classification algorithm
- maximum likelihood
- unsupervised learning
- topic models
- information extraction
- document clustering
- information retrieval
- text analysis
- mixture modeling
- automatic model selection
- databases
- data analysis
- named entities
- bag of words
- knowledge discovery
- decision trees
- neural network
- data sets