A fast and robust method for web page template detection and removal.
Karane Vieira, Altigran Soares da Silva, Nick Pinto, Edleno Silva de Moura, João M. B. Cavalcanti, Juliana Freire
Published in: CIKM (2006)
Keyphrases
- detection method
- pairwise
- false positive rate
- detection algorithm
- high accuracy
- computationally efficient
- website
- image noise
- highly accurate
- detection rate
- high precision
- edge detection
- similarity measure
- image segmentation
- clustering method
- false positives
- computational cost
- experimental evaluation
- web pages
- web page clustering