Quantifying test collection quality based on the consistency of relevance judgements.
Falk Scholer, Andrew Turpin, Mark Sanderson
Published in: SIGIR (2011)
Keyphrases
- relevance judgements
- test collection
- relevance assessments
- information retrieval
- relevance feedback
- retrieval effectiveness
- retrieval systems
- retrieval model
- relevant documents
- language model
- average precision
- evaluation methodology
- high quality
- relevance judgments
- search tasks
- document collections
- databases
- search engine
- gold standard
- Chinese web
- TREC Web Track
- IR evaluation
- evaluation of information retrieval systems
- ad hoc retrieval
- learning to rank
- precision and recall
- information retrieval systems