A la Recherche de noeuds informatifs dans des corpus de documents XML [In Search of Informative Nodes in Corpora of XML Documents].
Karen Sauvagnat, Mohand Boughanem
Published in: CORIA (2005)
Keyphrases
- xml documents
- xml format
- semi structured documents
- document structure
- document centric
- metadata
- word frequencies
- person names
- extensible markup language
- newspaper articles
- structured documents
- xml data
- document repository
- document collections
- xml queries
- information retrieval systems
- data interchange
- xpath expressions
- training corpus
- document level
- multiword
- text corpora
- document representation
- natural language text
- information retrieval
- web documents
- xml schema
- relational databases
- xpath queries
- semi structured data
- information extraction
- retrieval systems
- electronic documents
- structured data
- markup language
- relevant documents
- text documents
- data model
- document retrieval
- data exchange
- similar documents
- semi structured
- xml trees
- text data
- text collections
- named entities
- content and structure
- document clustering
- word frequency
- vector space model
- xml retrieval
- search engine
- databases