Information gain based dimensionality selection for classifying text documents.
Dumidu Wijayasekara, Milos Manic, Miles McQueen
Published in: IEEE Congress on Evolutionary Computation (2013)
Keyphrases
- text documents
- information gain
- text categorization
- text classification
- feature selection
- text mining
- chi squared
- decision trees
- document clustering
- naive bayes
- knn
- term frequency
- information extraction
- tf idf
- high dimensional
- document representation
- k nearest neighbor
- bag of words
- mutual information
- unlabeled data
- named entities
- data sets
- feature space
- topic models
- computer vision
- machine learning
- neural network
- wordnet
- keywords
- principal component analysis
- nearest neighbor