Publication: NL sampler: random sampling of web documents based on natural language with query hit estimation.