Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration.
Sunhao Dai, Weihao Liu, Yuqi Zhou, Liang Pang, Rongju Ruan, Gang Wang, Zhenhua Dong, Jun Xu, Ji-Rong Wen
Published in: CoRR (2024)
Keyphrases
- information retrieval
- document collections
- information retrieval systems
- relevant documents
- document retrieval
- vector space model
- retrieval systems
- learning to rank
- query terms
- language model
- retrieved documents
- test collection
- automatic categorization
- query expansion
- text collections
- distributed information retrieval
- information extraction
- text documents
- text mining
- maximal marginal relevance
- document indexing
- text retrieval
- retrieval effectiveness
- real world
- document categorization
- information access
- information retrieval evaluation
- structured documents
- text categorization
- question answering
- language modeling
- information seeking
- effective retrieval
- relevance feedback
- vector space
- document search
- retrieval model
- keywords
- boolean queries
- ranked list
- metadata
- document ranking
- retrieval strategies
- term weighting
- latent semantic indexing
- document classification
- document representation