Language Specific and Topic Focused Web Crawling.
Olena Medelyan, Stefan Schulz, Jan Paetzold, Michael Poprat, Kornél G. Markó
Published in: LREC (2006)
Keyphrases
- web crawling
- topic specific
- language specific
- focused crawling
- search engine
- language independent
- topic modeling
- natural language
- machine translation
- n gram
- web documents
- web mining
- data mining
- cross lingual
- web data
- labor intensive
- specific features
- deep web
- data mining and machine learning
- probabilistic model
- text retrieval
- news articles
- web search
- domain specific