Document Representation and Dimension Reduction for Text Clustering.
M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evangelos E. Milios, Bin Tang, Jane Tougas, Raymond J. Spiteri
Published in: ICDE Workshops (2007)
Keyphrases
- text clustering
- document representation
- principal component analysis
- document clustering
- low dimensional
- text data
- high dimensional data
- high dimensional
- vector space
- bag of words
- singular value decomposition
- feature extraction
- vector space model
- text documents
- document collections
- language model
- data fusion
- cluster analysis
- background knowledge
- semantic information
- web documents
- text mining
- feature space
- dimensionality reduction
- image classification
- nearest neighbor
- data points
- computer vision
- wordnet
- similarity search
- machine learning
- k-means
- pattern recognition
- face recognition
- image segmentation
- clustering algorithm
- image processing
- feature selection
- learning algorithm
- information retrieval