Keyphrases
- text classification
- feature selection
- text categorization
- real world
- n-gram
- bag of words
- naive Bayes
- machine learning
- document classification
- multi-label
- mutual information
- text mining
- labeled data
- kNN
- information theory
- data cleaning
- minimum error
- text documents
- sentiment analysis
- data quality
- Shannon entropy
- information entropy
- semantic features
- text data
- website
- knowledge discovery
- k-nearest neighbor