Evaluating the Utility of Statistical Phrases and Latent Semantic Indexing for Text Classification.
Huiwen Wu, Dimitrios Gunopulos
Published in: ICDM (2002)
Keyphrases
- latent semantic indexing
- text classification
- bag of words
- document representation
- text retrieval
- singular value decomposition
- vector space
- information retrieval
- text categorization
- linguistic features
- vector space model
- labeled data
- text documents
- n-gram
- text mining
- feature selection
- machine learning
- semantic features
- natural language
- high dimensional data
- domain knowledge
- digital libraries
- text data
- knowledge base