UMTIT: Unifying Recognition, Translation, and Generation for Multimodal Text Image Translation.
Liqiang Niu, Fandong Meng, Jie Zhou
Published in: LREC/COLING (2024)
Keyphrases
- image data
- input image
- image matching
- image content
- image classification
- single image
- machine translation
- recognition rate
- image analysis
- image segmentation
- template matching
- image retrieval
- machine translation system
- feature extraction
- test images
- query translation
- connected component analysis
- image representation
- image features
- text retrieval
- object recognition
- high resolution
- cross language information retrieval
- partial occlusion
- image collections
- handwritten words
- low level
- scanned documents
- segmentation algorithm
- web images
- segmentation method
- multiscale
- action recognition
- printed documents
- statistical machine translation
- object models
- recognition algorithm
- multi modal
- image regions