ChainMR Crawler: A Distributed Vertical Crawler Based on MapReduce.
Xixia Liu, Zhengping Jin
Published in: SpaCCS Workshops (2016)
Keyphrases
- search engine
- website
- web pages
- web crawler
- distributed systems
- focused crawling
- distributed processing
- distributed computing
- focused crawler
- web crawling
- topic specific
- communication cost
- cooperative
- distributed environment
- peer to peer
- data intensive
- communication overhead
- mobile agents
- web crawlers
- web users
- parallel processing
- fault tolerant
- map reduce
- web search engines
- learning algorithm