Magic Markup: Maintaining Document-External Markup with an LLM.
Edward Misback, Zachary Tatlock, Steven L. Tanimoto
Published in: CoRR (2024)
Keyphrases
- document structure
- markup language
- information retrieval
- document images
- databases
- textual content
- internal and external
- web search
- document collections
- retrieval systems
- content and structure
- structured documents
- document representation
- document classification
- xml schema
- document retrieval
- co-occurrence
- information extraction
- digital libraries
- keywords
- database systems
- machine learning