Classification de documents combinant la structure et le contenu (Document classification combining structure and content).
Samaneh Chagheri, Catherine Roussey, Sylvie Calabretto, Cyril Dumoulin
Published in: CORIA (2012)
Keyphrases
- document classification
- pattern recognition
- support vector machine
- information retrieval
- text documents
- classification method
- support vector
- classification accuracy
- text classification
- feature selection
- document structure
- pattern classification
- pre-classified
- machine learning
- automatic categorization
- classification scheme
- classification algorithm
- benchmark datasets
- web documents
- XML documents
- feature vectors
- feature extraction
- decision trees
- hierarchical structure
- document clustering
- information retrieval systems
- classification models
- text classifiers
- feature space
- logical structure