Investigating Usage of Text Segmentation and Inter-passage Similarities to Improve Text Document Clustering.
Shashank Paliwal, Vikram Pudi
Published in: MLDM (2012)
Keyphrases
- text segmentation
- document clustering
- document set
- text mining
- text documents
- document collections
- vector space model
- clustering method
- k means
- clustering algorithm
- similarity measure
- document representation
- question answering
- tf idf
- sentence level
- information extraction
- language model
- text categorization
- data mining
- association rules
- character recognition
- information retrieval