Document engineering approaches toward scalable and structured multimedia, web and printable documents.
Maria da Graça Campos Pimentel, Dick C. A. Bulterman, Luiz Fernando Gomes Soares
Published in: Multim. Tools Appl. (2009)
Keyphrases
- web documents
- multimedia documents
- multimedia
- digital documents
- document collections
- information retrieval systems
- information retrieval
- relevant documents
- digital libraries
- retrieval systems
- electronic documents
- multilingual documents
- text documents
- document classification
- retrieval strategies
- document clustering
- vector space model
- web scale
- content similarity
- google scholar
- metadata
- keywords
- document retrieval
- structured documents
- document processing
- semi-structured
- document representation
- text content
- document structure
- web pages
- document analysis
- textual content
- web crawler
- xml format
- relevant content
- textual features
- web data
- semi-structured documents
- unstructured documents
- current web search engines
- website
- html pages
- extensible markup language
- search engine
- related documents
- structured data
- query expansion
- information extraction
- document content
- document similarity
- multimedia information
- document summarization
- text classifiers
- document set
- tf-idf
- query terms
- user queries
- cross references