On the Impact of Document Representation on Classifier Performance in e-Mail Categorization.
Helmut Berger, Monika Köhle, Dieter Merkl
Published in: ISTA (2005)
Keyphrases
- document representation
- document categorization
- bag of words
- document clustering
- document collections
- text categorization
- language model
- feature selection
- text documents
- data fusion
- vector space model
- vector space
- semantic information
- training data
- web documents
- text classification
- document content
- text mining
- information retrieval
- n gram
- feature space
- action recognition
- image classification
- information retrieval systems
- named entities
- information extraction
- data analysis
- evaluation measures
- digital libraries
- metadata
- computer vision