An automatic wrapper generation process for large scale crawling of news websites.
Iraklis Varlamis, Nikos Tsirakis, Vassilis Poulopoulos, Panayiotis Tsantilas
Published in: Panhellenic Conference on Informatics (2014)
Keyphrases
- generation process
- web pages
- website
- feature selection
- web crawler
- real world
- real life
- web mining
- news articles
- small scale
- search engine
- information retrieval
- data mining
- fully automatic
- black box
- data extraction
- wrapper induction
- semi automatic
- online news
- resource discovery
- news pages
- data sets
- social media
- digital libraries
- video sequences
- neural network