Document Attrition in Web Corpora: an Exploration.
Stephen Wattam, Paul Rayson, Damon Berridge
Published in: LREC (2012)
Keyphrases
- web corpora
- query expansion
- relevant documents
- text documents
- query translation
- information retrieval
- keywords
- web documents
- information retrieval systems
- document classification
- retrieval systems
- semantic information
- document representation
- document clustering
- document retrieval
- document collections
- search engine
- web search
- information extraction