Text Classification Performance: Is the Sample Size the Only Factor to be Considered?
Rosa L. Figueroa, Qing Zeng-Treitler
Published in: MedInfo (2013)
Keyphrases
- sample size
- text classification
- model selection
- random sampling
- small sample size
- upper bound
- covariance matrix
- statistical tests
- small sample
- feature selection
- variance reduction
- statistical hypothesis testing
- multi label
- text mining
- experimental design
- progressive sampling
- statistical power
- small samples
- pac learning
- vc dimension
- confidence intervals
- worst case
- text categorization
- knn
- machine learning
- labeled data
- number of training samples
- naive bayes
- knowledge acquisition
- data points
- data streams
- unlabeled data
- semi supervised learning