R-tfidf, a Variety of tf-idf Term Weighting Strategy in Document Categorization.
Dengya Zhu, Jitian Xiao
Published in: SKG (2011)
Keyphrases
- term weighting
- tf idf
- document categorization
- text categorization
- vector space model
- text documents
- term frequency inverse document frequency
- document clustering
- term frequency
- inverse document frequency
- term weighting schemes
- text classification
- information retrieval
- document representation
- latent semantic indexing
- term weights
- information extraction
- feature selection
- knn
- text retrieval
- retrieval model
- text mining
- vector space
- k nearest neighbor
- language model
- keywords
- retrieval systems
- semantic similarity
- language modeling
- training data
- named entities
- clustering algorithm
- clustering method
- document collections
- wordnet