NL sampler: random sampling of web documents based on natural language with query hit estimation.
Daniel Schuster, Alexander Schill
Published in: SAC (2007)
Keyphrases
- web documents
- random sampling
- natural language
- information extraction
- natural language questions
- keywords
- active learning
- sampling procedure
- sampling algorithm
- semi structured
- web pages
- question answering
- web search engines
- sample size
- natural language interface
- natural language processing
- machine learning
- query processing
- html documents
- random samples
- sliding window
- related web pages
- relevance feedback
- written in natural language
- search queries
- database
- link structure
- social annotations
- random sample
- data structure
- returned by a search engine
- monte carlo
- data sources
- data mining
- semistructured data
- web logs
- robust estimation
- structured data
- markov chain
- information retrieval systems
- e learning