Boosting Text Classification through Stemming of Composite Words.
Marenglen Biba, Eva Gjati
Published in: ISI (2013)
Keyphrases
- text classification
- n-gram
- feature selection
- text documents
- bag of words
- text categorization
- distributional clustering
- word segmentation
- text mining
- language independent
- machine learning
- language modeling
- training corpus
- language model
- data cleaning
- stop words
- naive bayes
- multi-label
- sentiment analysis
- character n-grams
- learning algorithm
- sentiment classification
- information extraction
- part of speech
- unlabeled data
- text classifiers
- semantic features
- text data
- ensemble learning
- support vector machine
- information theoretic
- data analysis
- knn
- training documents
- labeled data
- multi-class