Exploring Character Trigrams for Robust Arabic Text Classification: A Comparative Analysis in the Face of Vocabulary Expansion and Misspelled Words.
Dorieh M. Alomari, Irfan Ahmad
Published in: IEEE Access (2024)
Keyphrases
- n gram
- text classification
- out of vocabulary
- text documents
- bag of words
- language independent
- training corpus
- text categorization
- machine learning
- handwritten words
- text data
- language modeling
- Arabic language
- sentiment analysis
- feature selection
- keywords
- kNN
- automatic text classification
- text recognition
- natural language
- part of speech
- text mining
- face images
- language model
- text classifiers
- data cleaning
- word segmentation
- facial expressions
- unknown words
- probabilistic model
- printed text
- distributional clustering
- human faces