Detecting duplicate web documents using clickthrough data.
Filip Radlinski, Paul N. Bennett, Emine Yilmaz
Published in: WSDM (2011)
Keyphrases
- web documents
- clickthrough data
- web search engines
- web search
- web pages
- query logs
- information extraction
- search engine
- semi-structured
- keywords
- vector space model
- ranking functions
- log data
- relevance judgments
- web data
- implicit feedback
- n-gram
- search queries
- query suggestion
- language model
- user queries
- relevance ranking
- structured documents
- search result
- document representation
- link structure
- eye tracking
- user behavior