Feature Selection on Chinese Text Classification Using Character N-Grams.
Zhihua Wei, Duoqian Miao, Jean-Hugues Chauchat, Caiming Zhong
Published in: RSKT (2008)
Keyphrases
- text classification
- n-gram
- feature selection
- character n-grams
- word segmentation
- text categorization
- bag of words
- language independent
- variable length
- labeled data
- language modeling
- text documents
- text data
- machine learning
- text mining
- text classifiers
- knn
- k nearest neighbor
- term frequency
- data mining
- transfer learning
- cross language
- language specific
- feature extraction
- cross lingual
- unlabeled data
- information retrieval
- language model