Keyphrases
- web documents
- tree-structured patterns
- information extraction
- semi-structured
- web pages
- keywords
- web search engines
- document classification
- html documents
- web content
- structured documents
- vector space model
- textual information
- website
- document representation
- web data
- tree structure
- link structure
- tree patterns
- tree structures
- phylogenetic trees
- dynamically generated
- unstructured documents
- active learning
- semistructured documents