Information Extraction from the Web by Matching Visual Presentation Patterns.
Radek Burget
Published in: KEKI/NLP&DBpedia@ISWC (2016)
Keyphrases
- information extraction
- web documents
- web mining
- web information extraction
- website
- textual data
- ontology population
- web applications
- data extraction
- natural language processing
- web pages
- extraction patterns
- web resources
- matching algorithm
- web users
- named entity recognition
- image matching
- web data
- data mining techniques
- text mining
- entity resolution
- multimedia
- web images
- relation extraction
- usage patterns
- semi-structured
- web content
- machine learning
- semantic web
- visual features
- linked data
- information retrieval
- linguistic patterns
- named entities
- design patterns
- pattern discovery
- sequential patterns
- visual patterns
- graph matching
- information extraction systems
- low level
- end users
- visual representations
- image classification
- matching process
- data mining
- structured data
- web logs
- conditional random fields
- user generated content
- machine translation