Using Micro-Documents for Feature Selection: The Case of Ordinal Text Classification.
Stefano Baccianella, Andrea Esuli, Fabrizio Sebastiani
Published in: IIR (2011)
Keyphrases
- text classification
- feature selection
- text documents
- document classification
- text classifiers
- text categorization
- labeled documents
- text data
- document categorization
- training documents
- term frequency
- training corpus
- bag of words
- naive Bayes
- text mining
- classify documents
- data cleaning
- labeled data
- distributional clustering
- document clustering
- sentiment classification
- information retrieval systems
- machine learning
- information retrieval
- feature weighting
- automatic text classification
- sentiment analysis
- feature space
- feature engineering
- semantic features
- metadata
- feature reduction
- mutual information
- unlabeled data
- document collections
- web documents
- n-gram
- multi-label
- relevant documents
- semantic information
- co-occurrence
- vector space model
- classification accuracy
- decision trees
- language modeling