DocCat: un composant logiciel de catégorisation de documents et de marquage sémantique XML [DocCat: a software component for document categorization and semantic XML tagging].
Georges Gardarin, Huaizhong Kou, Karine Zeitouni
Published in: Ingénierie des Systèmes d'Inf. (2003)
Keyphrases
- xml documents
- xml format
- semi-structured documents
- document-centric
- document structure
- metadata
- extensible markup language
- xml data
- structured documents
- information retrieval
- standard for data exchange
- document collections
- document repository
- document type
- data exchange
- xml schema
- electronic documents
- information retrieval systems
- document classification
- xml files
- xml databases
- document clustering
- data model
- markup language
- structured data
- document retrieval
- xml fragments
- free text
- xml queries
- semi-structured data
- content and structure
- relational databases
- data integration
- digital libraries
- keyword search
- semi-structured
- text documents
- co-occurrence
- logical structure
- xml retrieval
- vector space model
- database systems
- user queries
- semantic information
- web search engines
- database