A Dataset for Content Error Detection in Web Archives.
Johannes Kiesel, Fabienne Hubricht, Benno Stein, Martin Potthast
Published in: JCDL (2019)
Keyphrases
- error detection
- web content
- error correction
- web resources
- user generated content
- website
- web documents
- metadata
- digital archives
- multimedia
- rss feeds
- error recovery
- data cleansing
- dynamic content
- content management
- web information
- fault isolation
- content and structure
- user experience
- web applications
- error correcting
- photo collections
- fault tolerance
- web mining
- web data
- relevant content
- content creation
- online communities
- page content
- digital libraries
- user interests
- content similarity
- web images
- web portals
- online resources
- intelligent systems
- end users
- web users
- cultural heritage
- web pages
- multi agent
- social networking sites