Semantically Enhanced Text Stemmer (SETS) for Cross-Domain Document Clustering.
Ivan Stankov, Diman Todorov, Rossitza Setchi
Published in: KES (Selected Papers) (2012)
Keyphrases
- document clustering
- cross-domain
- semantically enhanced
- document representation
- text documents
- text mining
- text categorization
- transfer learning
- text classification
- text data
- document collections
- clustering algorithm
- language model
- information extraction
- knowledge transfer
- vector space model
- web documents
- semantic information
- information retrieval
- topic models
- named entities
- cluster analysis
- clustering method
- keywords
- target domain
- n-gram
- vector space
- cross-lingual
- natural language processing
- kNN
- k-means
- machine learning