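Recherche de documents similaires sur le web par segmentations hiérarchiques et extraction de mots-clés. [Searching for similar documents on the web using hierarchical segmentation and keyword extraction.]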
Alain Simac-Lejeune
Published in: EGC (2013)
Keyphrases
- web documents
- web data
- web information
- information retrieval
- multilingual documents
- information extraction
- website
- web information extraction
- newspaper articles
- web pages
- data extraction
- information retrieval systems
- web applications
- text information
- content similarity
- extraction rules
- digital documents
- web mining
- textual data
- text documents
- open directory project
- document repositories
- ground truth
- web content
- metadata
- web users
- image segmentation
- document retrieval
- relevant documents
- document collections
- medical images
- end users
- document clustering
- google scholar
- information sources
- textual features
- xml documents
- search click data
- answering questions
- document classification
- user interests
- web search engines
- keywords
- search engine