Segmentation of Web Documents and Retrieval of Useful Passages.
Carlos G. Figuerola, José Luis Alonso Berrocal, Ángel F. Zazo Rodríguez
Published in: CLEF (2007)
Keyphrases
- web documents
- structured documents
- content similarity
- information extraction
- document retrieval
- semi-structured
- web pages
- textual information
- information retrieval
- web search engines
- web content
- keywords
- question answering
- focused crawling
- information retrieval systems
- html documents
- passage retrieval
- test collection
- relevance feedback
- web directories
- vector space model
- document representation
- link structure
- image retrieval
- retrieval model
- active learning
- machine learning
- unstructured documents