Little Words Can Make a Big Difference for Text Classification.
Ellen Riloff
Published in: SIGIR (1995)
Keyphrases
- text classification
- n-gram
- text documents
- training corpus
- distributional clustering
- text categorization
- bag of words
- training documents
- word segmentation
- machine learning
- text representation
- text mining
- automatic text classification
- text data
- naive bayes
- feature selection
- knn
- english words
- data cleaning
- document classification
- text classifiers
- document representation
- term frequency
- semantic features
- part of speech
- sentiment classification
- decision trees
- related words
- multiword
- multi label
- unlabeled data
- tf-idf
- language modeling
- feature space
- big data
- keywords
- information theoretic
- training data
- hidden markov models
- labeled data
- neural network