Improving kNN Text Categorization by Removing Outliers from Training Set.
Kwangcheol Shin, Ajith Abraham, Sang-Yong Han
Published in: CICLing (2006)
Keyphrases
- text categorization
- knn
- k nearest neighbor
- nearest neighbor
- training set
- classification algorithm
- k nearest neighbour
- k nearest
- text classification
- multi label
- information gain
- similarity search
- support vector machine
- data points
- distance function
- feature selection
- support vector machine svm
- naive bayes
- text documents
- reuters corpus
- nearest neighbour
- text classifiers
- knn algorithm
- unlabeled data
- classification accuracy
- training data
- training samples
- input space
- feature space
- knn classifier
- supervised learning
- automatic text categorization
- data sets
- training examples
- multi class
- pairwise
- data analysis
- machine learning