Word-Level Alignment of Paper Documents with their Electronic Full-Text Counterparts.
Mark-Christoph Müller, Sucheta Ghosh, Ulrike Wittig, Maja Rey
Published in: CoRR (2021)
Keyphrases
- word level
- document analysis
- information retrieval systems
- language independent
- Chinese text retrieval
- sentence level
- retrieval systems
- document level
- document images
- document collections
- n-gram
- information retrieval
- character recognition
- machine translation
- digital libraries
- word recognition
- source language
- relevant documents
- image analysis
- sentence pairs
- document retrieval
- XML documents
- document clustering
- text retrieval
- word segmentation
- query expansion
- Viterbi algorithm
- sentiment analysis
- text analysis
- sentiment classification
- web documents
- keywords
- multi-document summarization
- cross language
- test collection
- semantic information
- language model
- metadata