Exploiting probabilistic topic models to improve text categorization under class imbalance.
Enhong Chen, Yanggang Lin, Hui Xiong, Qiming Luo, Haiping Ma
Published in: Inf. Process. Manag. (2011)
Keyphrases
- text categorization
- class imbalance
- feature selection
- text classification
- multi label
- naive bayes
- active learning
- knn
- k nearest neighbor
- class distribution
- cost sensitive
- text documents
- text classifiers
- semi supervised learning
- topic models
- text collections
- unlabeled data
- data mining
- multi class
- learning algorithm
- query expansion
- tf idf
- information retrieval