Predictive Crawling for Commercial Web Content.
Shuguang Han, Bernhard Brodowsky, Przemek Gajda, Sergey Novikov, Mike Bendersky, Marc Najork, Robin Dua, Alexandrin Popescul
Published in: WWW (2019)
Keyphrases
- web content
- web pages
- website
- user generated
- web data
- web documents
- web users
- search engine
- web crawling
- web resources
- web mining
- resource discovery
- web information
- web search engines
- web search
- web browsing
- html documents
- semantic browsing
- focused crawling
- social media
- information extraction
- web server
- link analysis
- user interests
- text content
- text classification
- focused crawler