SEWAR: A corpus-based N-gram approach for extracting semantically-related words from Arabic medical corpus.
Rana Husni AlMahmoud, Bassam H. Hammo
Published in: Expert Syst. Appl. (2024)
Keyphrases
- n-gram
- related words
- unknown words
- part of speech
- text corpus
- word segmentation
- language model
- text classification
- bag of words
- training corpus
- semantically related
- language modeling
- text corpora
- language independent
- text documents
- morphological analysis
- computational linguistics
- keywords
- character n-grams
- artificial intelligence
- news articles
- web documents
- text data
- information extraction
- feature selection