Stable web scraping: an approach based on neighbour zone and path similarity of page elements.
Peng Gao, Hao Han, Junxia Guo, Motoshi Saeki
Published in: Int. J. Web Eng. Technol. (2018)
Keyphrases
- website
- web pages
- web applications
- similarity measure
- page content
- content similarity
- web users
- page importance
- link analysis
- web resources
- traversal patterns
- web content
- search engine
- web graph
- anchor text
- web log mining
- hyperlink structure
- web browsing
- distance measure
- web documents
- web technologies
- information sources
- semantic web
- semantic similarity
- pagerank algorithm
- content features
- web search
- similarity function
- end users
- home page
- keywords
- web sessions
- shortest path