Precise Detection of Content Reuse in the Web.
Calvin Ardi, John S. Heidemann
Published in: Comput. Commun. Rev. (2019)
Keyphrases
- web content
- user generated content
- web documents
- web resources
- web information
- content management
- website
- content similarity
- page content
- web portals
- dynamic content
- detection method
- web pages
- automatic detection
- user generated
- anomaly detection
- content and structure
- content creation
- web applications
- rss feeds
- spam detection
- web images
- link analysis
- user interests
- object detection
- detection rate
- learning objects
- user experience
- web mining
- information sources
- semantic web
- false alarms
- metadata
- hyperlink structure
- relevant content
- text content
- user behavior
- linked data
- web technologies
- web data
- false positives
- software reuse
- web queries
- xml documents
- multimedia content
- multimedia