Classification de documents XML à partir d'une représentation linéaire des arbres de ces documents [Classification of XML documents based on a linear representation of the documents' trees].
Anne-Marie Vercoustre, Mounir Fegas, Yves Lechevallier, Thierry Despeyroux
Published in: EGC (2006)
Keyphrases
- xml documents
- document classification
- metadata
- document collections
- semi structured documents
- xml format
- information retrieval
- document structure
- web documents
- automatic categorization
- document centric
- database
- document clustering
- relevant documents
- machine learning
- document retrieval
- decision trees
- text classification
- feature extraction
- image classification
- information retrieval systems
- electronic documents
- document categorization
- document representation
- semantic information
- feature selection
- automatic classification
- free text
- text documents
- class labels
- relational databases
- support vector
- content and structure
- xml schema
- keywords
- extensible markup language
- classification algorithm
- classify documents