Classification of document pages using structure-based features.
Christian K. Shin, David S. Doermann, Azriel Rosenfeld
Published in: Int. J. Document Anal. Recognit. (2001)
Keyphrases
- classification accuracy
- feature vectors
- feature set
- feature extraction
- classification method
- feature space
- classification process
- classification models
- document classification
- benchmark datasets
- decision trees
- pattern recognition
- svm classifier
- class labels
- search engine
- extracted features
- feature selection
- support vector machine svm
- structural features
- support vector machine
- classification algorithm
- support vector
- web pages
- feature analysis
- web documents
- feature subset
- structural information
- textual content
- extracting features
- automatic classification
- feature selection algorithms
- text classification
- semantic information
- image features
- content features
- keywords
- textual features
- feature values
- machine learning
- information retrieval systems
- text categorization