Expériences de classification d'une collection de documents XML de structure homogène (Experiments in classifying a collection of XML documents with a homogeneous structure)
Thierry Despeyroux, Yves Lechevallier, Brigitte Trousse, Anne-Marie Vercoustre
Published in: CoRR (2005)
Keyphrases
- XML documents
- automatic categorization
- document collections
- document classification
- document structure
- XML format
- content and structure
- metadata
- document-centric
- semi-structured documents
- pre-classified
- information retrieval systems
- logical structure
- information retrieval
- classification accuracy
- machine learning
- XML databases
- feature vectors
- XML queries
- automatic classification
- markup language
- document retrieval
- database
- feature space
- XML data
- data exchange
- text documents
- decision trees
- text categorization
- support vector machine
- document categorization
- databases
- classify documents
- feature extraction
- support vector
- distributed information retrieval
- text classification
- document set
- structured documents
- web documents
- structured data
- document representation
- free text