A site oriented method for segmenting web pages.
David Fernandes de Oliveira, Edleno Silva de Moura, Altigran Soares da Silva, Berthier A. Ribeiro-Neto, Edisson Braga Araújo
Published in: SIGIR (2011)
Keyphrases
- detection method
- website
- synthetic data
- high accuracy
- web pages
- fully automatic
- high precision
- similarity measure
- dynamic programming
- significant improvement
- optimization algorithm
- feature set
- information retrieval
- experimental evaluation
- machine learning
- prior knowledge
- training set
- preprocessing
- search engine
- segmentation method
- learning algorithm