Expériences de classification d'une collection de documents XML de structure homogène (Classification experiments on a collection of XML documents with homogeneous structure).
Thierry Despeyroux, Yves Lechevallier, Brigitte Trousse, Anne-Marie Vercoustre
Published in: EGC (2005)
Keyphrases
- xml documents
- document collections
- automatic categorization
- document classification
- document structure
- content and structure
- classification accuracy
- metadata
- logical structure
- machine learning
- xml data
- xml format
- information retrieval systems
- pre-classified
- semi-structured documents
- text collections
- database
- relevant documents
- classification algorithm
- keywords
- text categorization
- relational databases
- document categorization
- decision trees
- information retrieval
- structured documents
- support vector machine
- automatic classification
- text documents
- document repositories
- document clustering
- web documents
- databases
- feature extraction
- support vector
- document-centric
- query language
- extensible markup language
- image classification
- text classification
- xml schema
- document retrieval
- data exchange
- xml retrieval
- semi-structured
- markup language
- xml databases