LMDX: Language Model-based Document Information Extraction and Localization.
Vincent Perot, Kai Kang, Florian Luisier, Guolong Su, Xiaoyu Sun, Ramya Sree Boppana, Zilong Wang, Zifeng Wang, Jiaqi Mu, Hao Zhang, Chen-Yu Lee, Nan Hua
Published in: ACL (Findings) (2024)
Keyphrases
- information extraction
- text documents
- web documents
- unstructured documents
- information retrieval
- natural language
- text mining
- precision and recall
- named entity recognition
- machine translation
- information retrieval systems
- question answering
- text summarization
- source language
- retrieval systems
- target language
- natural language processing
- language learning
- document collections
- conditional random fields
- document images
- programming language
- semi-structured
- multilingual documents
- machine learning
- free text
- web mining
- relation extraction
- structured data
- document analysis
- domain specific
- intended meaning
- digital libraries
- database
- ontology-based information extraction