Word Segmentation for Vietnamese Text Categorization: An Internet-based Statistic and Genetic Algorithm Approach.
Hung Nguyen Thanh, Khanh Bui Doan
Published in: TALN (Posters) (2006)
Keyphrases
- text categorization
- word segmentation
- text classification
- feature selections
- knn
- n gram
- language independent
- feature selection
- k nearest neighbor
- text documents
- naive bayes
- text classifiers
- text mining
- semi supervised learning
- labeled data
- language modeling
- cross lingual
- cross language
- document analysis
- named entity recognition
- machine learning
- feature selection and classifier
- text collections
- neural network
- tf idf
- bag of words
- image analysis
- natural language