New Text Classification Strategy Based on a Word Embedding and Noise-Words Removal.
Ahmad Hussein Ababneh, Yousef K. Sanjalawe
Published in: ACIT (2023)
Keyphrases
- text classification
- n-gram
- distributional clustering
- training corpus
- word segmentation
- text categorization
- related words
- information theoretic
- text documents
- statistical language modeling
- bag of words
- English words
- word recognition
- word sense disambiguation
- term frequency
- text corpus
- feature selection
- word pairs
- machine learning
- language modeling
- keywords
- word frequencies
- multi-label
- printed text
- cross-lingual
- word meaning
- text mining
- word frequency
- text classifiers
- lexical information
- data cleaning
- translation model
- syntactic categories
- language model
- training documents
- linguistic information
- unknown words
- word level
- sentiment classification
- co-occurrence
- linguistic knowledge
- noun phrases
- part of speech
- handwritten documents
- out of vocabulary
- multiword
- latent topics
- natural language
- Chinese word segmentation
- word sense
- watermarking algorithm
- word meanings
- short list
- semantic features