Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents.
Tofik Ali, Partha Pratim Roy
Published in: CoRR (2023)
Keyphrases
- information extraction
- web documents
- information retrieval
- free text
- keywords
- multi task
- electronic documents
- text documents
- document collections
- information retrieval systems
- unstructured documents
- digital documents
- document content
- text mining
- prior knowledge
- document classification
- data analysis
- collaborative filtering
- natural language processing
- document clustering
- multi class
- training set
- feature selection