Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination.
Hao Fei, Qian Liu, Meishan Zhang, Min Zhang, Tat-Seng Chua
Published in: ACL (1) (2023)
Keyphrases
- visual scene
- machine translation
- complex scenes
- single image
- input image
- image data
- vision system
- image collections
- image content
- image features
- image regions
- visual attention
- low level
- visual information
- information extraction
- object recognition
- natural language processing
- cross lingual
- spatial relations
- video sequences
- image classification
- multiple objects
- natural images
- image set
- image matching
- image retrieval
- cross language information retrieval
- image sequences
- statistical machine translation
- multi modal
- target language
- visual features
- machine translation system
- multiple images
- bag of words
- search engine
- text classification
- probabilistic model
- similarity measure
- high level
- image segmentation