TFIDF, LSI and multi-word in information retrieval and text categorization.
Wen Zhang, Taketoshi Yoshida, Xijin J. Tang
Published in: SMC (2008)
Keyphrases
- text categorization
- text clustering
- multiword
- latent semantic indexing
- information retrieval
- term weighting
- text classification
- text collections
- document representation
- text documents
- context sensitive
- knn
- feature selection
- tf idf
- multi label
- vector space model
- language model
- document classification
- k nearest neighbor
- text retrieval
- term frequency
- document collections
- feature weighting
- semi supervised learning
- text mining
- document retrieval
- vector space
- part of speech
- data fusion
- language modeling
- document clustering
- singular value decomposition
- test collection
- term weighting schemes
- neural network
- textual data
- web documents
- prior knowledge
- high dimensional
- search engine
- data mining