Finding the Minimum Document Length for Reliable Clustering of Multi-Document Natural Language Corpora.
Hermann Moisl
Published in: J. Quant. Linguistics (2011)
Keyphrases
- natural language
- natural language processing
- clustering method
- clustering algorithm
- document length
- k-means
- cluster analysis
- unsupervised learning
- text classification
- smoothing methods
- text data
- document clustering
- retrieval systems
- question answering
- machine learning
- language model
- text mining
- high dimensional
- feature selection
- search engine