A machine learning approach for Arabic text classification using N-gram frequency statistics.
Laila Khreisat
Published in: J. Informetrics (2009)
Keyphrases
- n gram
- text classification
- bag of words
- feature selection
- naive bayes
- variable length
- text mining
- language independent
- text categorization
- language modelling
- sentiment analysis
- part of speech
- machine learning
- language modeling
- labeled data
- knn
- text data
- character n grams
- viterbi algorithm
- text documents
- semantic features
- text classifiers
- multi label
- language model
- cross lingual
- term frequency
- statistical language modeling
- inside outside algorithm
- word segmentation