Imbalanced Chinese Text Classification Based on Weighted Sampling.
Hu Li, Peng Zou, Weihong Han, Rongze Xia
Published in: ISCTCS (2013)
Keyphrases
- text classification
- word segmentation
- imbalanced data
- text categorization
- bag of words
- feature selection
- text mining
- text data
- naive Bayes
- random sampling
- machine learning
- sentiment analysis
- kNN
- minority class
- semantic features
- Monte Carlo
- unlabeled data
- labeled data
- n-gram
- data cleaning
- imbalanced class distribution
- multi-label
- text documents
- sampling algorithm
- text classifiers
- class distribution
- feature space
- feature reduction
- learning algorithm