Out of Sight, Out of Mind: Detecting Orphaned Web Pages at Internet-Scale.
Stijn Pletinckx, Kevin Borgolte, Tobias Fiebig
Published in: CCS (2021)
Keyphrases
- web pages
- website
- artificial intelligence
- search engine
- web search
- web documents
- dynamically generated
- web page classification
- web data
- web content
- web server
- information retrieval
- link analysis
- link structure
- web information extraction
- web spam detection
- random walk
- web resources
- textual content
- social bookmarking
- information retrieval systems
- social networks
- google search engine
- web content mining