Chuweb21D: A Deduped English Document Collection for Web Search Tasks.
Zhumin Chu, Tetsuya Sakai, Qingyao Ai, Yiqun Liu
Published in: SIGIR-AP (2023)
Keyphrases
- search tasks
- document collections
- test collection
- cross language
- trec web
- search interface
- scatter gather
- document retrieval
- information retrieval
- exploratory search
- search result
- relevant documents
- anchor text
- information retrieval systems
- retrieval model
- retrieval effectiveness
- user search behavior
- text retrieval
- document representation
- document clustering
- information seeking
- cl-sr
- digital libraries
- web documents
- web users
- search sessions
- database
- web content
- web pages
- spoken document retrieval
- natural language
- natural language processing
- language model
- retrieval systems
- vector space model
- language modeling
- machine translation
- search engine