Term Selection with Distributional Clustering for Chinese Text Categorization using N-grams.
Jyh-Jong Tsay, Jing-Doo Wang
Published in: ROCLING (2) (1999)
Keyphrases
- term selection
- n-gram
- distributional clustering
- text categorization
- text classification
- bag of words
- feature selection
- naive bayes
- document frequency
- kNN
- text documents
- k-nearest neighbor
- information gain
- language model
- part of speech
- text mining
- labeled data
- machine learning
- language modeling
- term weighting
- tf-idf
- cross-language
- semi-supervised learning
- pairwise
- information retrieval