A Span Extraction Approach for Information Extraction on Visually-Rich Documents.
Tuan-Anh D. Nguyen, Hieu M. Vu, Nguyen Hong Son, Minh-Tien Nguyen
Published in: CoRR (2021)
Keyphrases
- information extraction
- text documents
- free text
- web documents
- information retrieval
- unstructured documents
- text analysis
- textual data
- natural language text
- unstructured text
- precision and recall
- text mining
- natural language processing
- extraction rules
- document collections
- structured data
- relation extraction
- document classification
- information extraction systems
- document clustering
- semi-structured
- named entity recognition
- conditional random fields
- named entities
- machine learning
- web information extraction
- data extraction
- text processing
- news articles
- textual content
- web mining
- extraction patterns
- machine translation
- high level
- relational learning
- relevant documents
- document representation
- retrieval systems
- word sense disambiguation
- document retrieval
- document analysis
- keywords
- semantic information
- electronic documents
- XML documents
- entity extraction
- WordNet
- wrapper induction