Text Categorization in R: A Reduced N-Gram Approach.
Wilhelm M. Geiger, Johannes Rauch, Patrick Mair, Kurt Hornik
Published in: GfKl (2010)
Keyphrases
- text categorization
- n-gram
- text classification
- language model
- language independent
- feature selection
- text documents
- naive bayes
- multi-label
- bag of words
- text mining
- kNN
- k-nearest neighbor
- information gain
- document classification
- language modeling
- part of speech
- term frequency
- machine learning
- text classifiers
- unlabeled data
- databases
- viterbi algorithm
- labeled data
- information retrieval
- web documents
- knowledge representation
- artificial intelligence