Using XPath to Discover Informative Content Blocks of Web Pages.
Yan Fu, Dongqing Yang, Shiwei Tang, Tengjiao Wang, Jun Gao
Published in: SKG (2007)
Keyphrases
- web pages
- web content
- dom tree
- web documents
- xml documents
- textual content
- dynamic content
- search engine
- website
- web resources
- query evaluation
- hyperlink structure
- browsing experience
- metadata
- query language
- xml data
- content features
- dynamically generated
- transitive closure
- web page classification
- web search engines
- page content
- keywords
- social bookmarking
- data records
- news topics
- link structure
- web search
- information content
- news web sites
- information retrieval
- unstructured information
- html pages
- data extraction
- xml queries
- social media