Improving Information Extraction from Visually Rich Documents using Visual Span Representations.
Ritesh Sarkhel, Arnab Nandi
Published in: Proc. VLDB Endow. (2021)
Keyphrases
- information extraction
- free text
- text documents
- web documents
- information retrieval
- unstructured documents
- natural language text
- visual representations
- visual information
- document collections
- question answering
- document classification
- textual data
- document retrieval
- text mining
- information retrieval systems
- precision and recall
- mid level
- visual representation
- unstructured text
- natural language processing
- keywords
- relation extraction
- information extraction systems
- machine learning
- xml documents
- high level
- semantic content
- low level
- conditional random fields
- web mining
- semi structured
- named entity recognition
- textual information
- machine translation
- metadata
- relevant documents
- multimedia documents
- visual stimuli
- visual features
- human observers
- document clustering
- search engine
- document analysis
- user queries
- data extraction
- query terms
- vector space model