Keyphrases
- high recall
- content extraction
- high precision
- text content
- HTML documents
- web news
- web documents
- precision and recall
- web pages
- information retrieval
- structured documents
- information retrieval systems
- document collections
- keywords
- web content
- document clustering
- text corpus
- automatic extraction
- semi-structured
- document retrieval
- document representation
- database
- semistructured data
- multimedia information retrieval
- text documents
- retrieval systems
- digital archives
- text mining
- XML documents
- data mining