Feature Generation, Feature Selection, Classifiers, and Conceptual Drift for Biomedical Document Triage.
Aaron M. Cohen, Ravi Teja Bhupatiraju, William R. Hersh
Published in: TREC (2004)
Keyphrases
- feature generation
- feature selection
- text categorization
- feature representations
- document classification
- information extraction
- text documents
- co-training
- naive bayes
- inductive learning
- feature set
- information retrieval
- support vector
- text classification
- feature construction
- text mining
- statistical approaches
- web documents
- generative model
- document collections
- word sense disambiguation
- ensemble classifier
- support vector machine
- semantic features
- feature representation
- decision trees
- training data
- classification algorithm
- document clustering
- image representation
- inductive logic programming
- machine learning methods
- machine learning
- machine learning algorithms
- knn
- ensemble learning
- training set
- classification accuracy
- multi-class
- dimensionality reduction
- named entities
- concept drift
- classification models
- feature subset
- semantic information
- unlabeled data
- semi-supervised learning
- knowledge base
- selection strategy
- information retrieval systems
- keywords
- data sets