Information Extraction from Visually Rich Documents with Font Style Embeddings.
Ismail Oussaid, William Vanhuffel, Pirashanth Ratnamogan, Mhamed Hajaiej, Alexis Mathey, Thomas Gilles
Published in: ICPR (2022)
Keyphrases
- information extraction
- free text
- web documents
- text documents
- information retrieval
- unstructured documents
- natural language text
- textual data
- text mining
- document collections
- vector space
- character recognition
- natural language processing
- precision and recall
- question answering
- information retrieval systems
- document clustering
- document retrieval
- unstructured text
- structured data
- document classification
- document analysis
- named entities
- machine learning
- semi-structured
- relevant documents
- retrieval systems
- textual information
- authorship attribution
- vector space model
- dimensionality reduction
- relation extraction
- optical character recognition
- OCR systems
- document image understanding
- text processing
- named entity recognition
- low dimensional
- multi-document summarization
- data extraction
- distance measure
- web mining
- document images
- XML documents
- user queries