Word-Level Alignment of Paper Documents with their Electronic Full-Text Counterparts.
Mark-Christoph Müller, Sucheta Ghosh, Ulrike Wittig, Maja Rey
Published in: BioNLP@NAACL-HLT (2021)
Keyphrases
- word level
- document analysis
- information retrieval systems
- language independent
- Chinese text retrieval
- sentence level
- document images
- retrieval systems
- document level
- machine translation
- character recognition
- word recognition
- information retrieval
- word segmentation
- digital libraries
- source language
- n-gram
- image analysis
- document collections
- parallel corpora
- relevant documents
- document retrieval
- sentence pairs
- text documents
- text retrieval
- keywords
- retrieved documents
- user queries
- target language
- web documents
- semantic roles
- metadata
- XML documents
- text analysis
- text mining
- vector space model
- co-occurrence
- text classification