Toward Optimal Feature Selection in Naive Bayes for Text Categorization.
Bo Tang, Steven Kay, Haibo He
Published in: CoRR (2016)
Keyphrases
- text categorization
- naive bayes
- feature selection
- text classification
- information gain
- logistic regression
- naive bayes classifier
- text classifiers
- knn
- multi label
- feature subset
- automated text categorization
- classification algorithm
- classification accuracy
- text documents
- document classification
- k nearest neighbor
- base classifiers
- decision trees
- bayes classifier
- semi supervised learning
- unlabeled data
- mutual information
- feature set
- bayesian network classifiers
- text mining
- natural language processing
- feature space
- tf idf
- term frequency
- machine learning
- neural network
- document frequency
- feature extraction
- term weighting
- support vector
- training set
- pairwise
- knowledge discovery
- nearest neighbor
- bag of words
- unsupervised learning
- labeled data