Exploiting statistical and semantic information for document clustering: An evaluation on feature selection.
Asmaa Benghabrit, Brahim Ouhbi, El Moukhtar Zemmouri, Bouchra Frikh, Hicham Behja
Published in: CIST (2014)
Keyphrases
- semantic information
- document clustering
- document representation
- semantic features
- vector space model
- feature selection
- wordnet
- text documents
- text mining
- document collections
- contextual information
- keywords
- semantic similarity
- tf idf
- clustering method
- background knowledge
- k means
- clustering algorithm
- domain knowledge
- high level
- text categorization
- metadata
- bag of words
- feature set
- document clusters
- text classification
- database
- low level
- xml documents
- feature space
- information retrieval
- machine learning
- knn
- feature extraction
- data mining
- databases