Keyphrases
- web documents
- web pages
- information extraction
- semi-structured
- keywords
- web search engines
- knowledge discovery
- document classification
- content similarity
- focused crawling
- HTML documents
- unstructured documents
- web content
- data mining
- web logs
- web data
- structured documents
- textual information
- vector space model
- information retrieval