Do not crawl in the DUST: different URLs with similar text.
Ziv Bar-Yossef
Idit Keidar
Uri Schonfeld
Published in:
WWW (2007)
Keyphrases
web pages
web search
web crawler
text documents
website
text mining
web documents
databases
text information
free text
information retrieval
text classification
multimedia
text retrieval
metadata
automatically extracted
text processing
text content
database