Utilizing Term Proximity based Features to Improve Text Document Clustering.
Shashank Paliwal, Vikram Pudi
Published in: KDIR (2011)
Keyphrases
- document clustering
- text documents
- text mining
- text clustering
- clustering algorithm
- document representation
- feature vectors
- information retrieval
- text classification
- text data
- vector space model
- cluster analysis
- k-means
- natural language processing
- co-occurrence
- information extraction
- active learning
- machine learning