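Reconnaissance et extraction de documents. Une application industrielle à la détection de documents semi-structurés. [Document recognition and extraction: an industrial application to the detection of semi-structured documents.]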
Olivier Augereau, Nicholas Journet, Jean-Philippe Domenger
Published in: Document Numérique (2013)
Keyphrases
- information retrieval systems
- web documents
- document collections
- relevant documents
- information retrieval
- free text
- XML documents
- text documents
- vector space model
- document classification
- keywords
- legal documents
- information extraction
- document retrieval
- text analysis
- database
- text categorization
- user queries
- kNN
- text retrieval
- retrieval effectiveness
- website
- web pages
- feature selection
- structured documents
- time stamped
- document content