Creating Permanent Test Collections of Web Pages for Information Extraction Research.
Bernhard Pollak, Wolfgang Gatterbauer
Published in: SOFSEM (2) (2007)
Keyphrases
- test collection
- information extraction
- web pages
- information retrieval
- web documents
- search engine
- anchor text
- data extraction
- retrieval effectiveness
- retrieval model
- document collections
- relevant documents
- natural language processing
- language model
- IR evaluation
- retrieval systems
- average precision
- relevance assessments
- text mining
- evaluation methodology
- evaluation campaigns
- question answering
- web search
- named entities
- information retrieval evaluation
- structured data
- precision and recall
- named entity recognition
- web search engines
- ad hoc retrieval
- relevance judgments
- newspaper articles
- patent retrieval
- keywords
- CLEF evaluation campaign
- machine learning
- machine translation
- text summarization
- link analysis
- relevance judgements
- web mining
- information retrieval systems
- text documents
- XML retrieval
- document set
- evaluation of information retrieval systems