Effect of Dimensionality Reduction on Different Distance Measures in Document Clustering.
Mari-Sanna Paukkeri, Ilkka Kivimäki, Santosh Tirunagari, Erkki Oja, Timo Honkela
Published in: ICONIP (3) (2011)
Keyphrases
- document clustering
- distance measure
- dimensionality reduction
- dimensionality reduction methods
- cosine similarity
- Euclidean distance
- text mining
- clustering method
- document representation
- distance function
- clustering algorithm
- document clusters
- document collections
- vector space model
- distance metric
- vector space
- tf-idf
- similarity measure
- principal component analysis
- high-dimensional data
- metric learning
- k-means
- high-dimensional
- low-dimensional
- text documents
- information retrieval
- pattern recognition
- feature extraction
- language model
- information extraction
- data points
- feature space
- search engine