A Hierarchical Extraction Policy for content extraction signatures: Selectively handling verifiable digital content.
Laurence Bull, David McG. Squire, Yuliang Zheng
Published in: Int. J. Digit. Libr. (2004)
Keyphrases
- text documents
- content extraction
- digital content
- digital libraries
- web news
- text classification
- digital information
- digital archives
- multimedia content
- metadata
- text content
- html documents
- multimedia information retrieval
- automatic extraction
- cross media
- database
- news pages
- cultural heritage
- data sources
- low level
- data analysis
- multimedia
- search engine