Research on New Algorithm of Topic-Oriented Crawler and Duplicated Web Pages Detection.
Yong-Heng Zhang, Feng Zhang
Published in: ICIC (2) (2012)
Keyphrases
- detection algorithm
- web pages
- learning algorithm
- website
- experimental evaluation
- preprocessing
- objective function
- dynamic programming
- optimization algorithm
- computational cost
- cost function
- search space
- computational complexity
- neural network
- probabilistic model
- significant improvement
- worst case
- Bayesian networks
- matching algorithm
- convergence rate
- false alarms
- HITS algorithm