Can Statistical Tests Be Used for Feature Selection in Diachronic Text Classification?
Sanja Stajner, Richard Evans
Published in: SLSP (2013)
Keyphrases
- feature extraction
- statistical tests
- text classification
- feature selection
- text categorization
- sample size
- hypothesis testing
- statistically significant
- bag of words
- naive Bayes
- statistical analysis
- statistical methods
- feature set
- image classification
- machine learning
- labeled data
- mutual information
- multi-class
- classification accuracy
- feature space
- independent variables
- model selection
- text classifiers
- support vector machine
- n-gram
- support vector
- null hypothesis
- information gain
- statistical significance
- data cleaning
- information retrieval
- kNN
- objective function
- semantic features
- feature subset
- unlabeled data
- web communities
- feature reduction
- feature engineering