Web Document Duplicate Detection Using Fuzzy Hashing.
Carlos G. Figuerola
Raquel Gómez Díaz
José Luis Alonso Berrocal
Ángel F. Zazo Rodríguez
Published in:
PAAMS (Workshops) (2011)
Keyphrases
duplicate detection
web documents
web pages
semi structured
record linkage
web content
keywords
graph search
information extraction
prefetching
web logs
dynamically generated
data cleaning
unstructured documents
web search engines
knowledge discovery
website
data sets
databases