Automatic Extraction of Textual Elements from News Web Pages.
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
Published in: LREC (2008)
Keyphrases
- automatic extraction
- keywords
- web pages
- html documents
- html pages
- search engine
- website
- textual contents
- wrapper generation
- plain text
- relation extraction
- web information extraction
- news web sites
- news articles
- natural language text
- web page classification
- web documents
- web server
- web content
- multimedia
- web search engines
- web search
- web users
- news topics
- text content
- link analysis
- textual content
- textual data
- textual features
- term extraction
- dynamically generated
- web content mining
- google search engine
- news events
- domain specific
- web communities
- anchor text
- textual information
- hierarchical structure
- link structure
- web graph