Text Categorization for Vietnamese Documents.
Giang-Son Nguyen, Xiaoying Gao, Peter Andreae
Published in: Web Intelligence/IAT Workshops (2009)
Keyphrases
- text categorization
- text documents
- document classification
- automatic categorization
- automatic text categorization
- training documents
- text classifiers
- document categorization
- term frequency
- text collections
- text classification
- document representation
- classify documents
- feature selection
- distributional clustering
- term weighting
- term selection
- multi label
- naive bayes
- knn
- word frequency
- document clustering
- semi supervised learning
- reuters corpus
- information gain
- tf idf
- k nearest neighbor
- text data
- document frequency
- information retrieval
- bag of words
- document collections
- information retrieval systems
- pairwise
- document set
- vector space model
- document retrieval
- n gram
- feature vectors
- machine learning
- feature selections
- feature selection for text categorization