Probabilistic models for focused web crawling.
Hongyu Liu, Evangelos E. Milios, Jeannette C. M. Janssen
Published in: WIDM (2004)
Keyphrases
- web crawling
- probabilistic model
- web mining
- search engine
- topic specific
- web data
- data mining and machine learning
- expectation maximization
- generative model
- Bayesian networks
- data mining
- text categorization
- deep web
- link analysis
- latent variables
- topic models
- natural language processing
- database
- machine learning
- databases