Hybrid Method for Automated News Content Extraction from the Web.
Yu Li, Xiaofeng Meng, Qing Li, Liping Wang
Published in: WISE (2006)
Keyphrases
- hybrid method
- web news
- content extraction
- news pages
- digital archives
- text content
- html documents
- hybrid algorithm
- website
- news articles
- online news
- user generated content
- web documents
- named entities
- semantic web
- support vector machine
- web pages
- cross media
- multimedia information retrieval
- optimization algorithm
- keywords