Vision talks: Visual relationship-enhanced transformer for video-guided machine translation.
Shiyu Chen, Yawen Zeng, Da Cao, Shaofei Lu
Published in: Expert Syst. Appl. (2022)
Keyphrases
- machine translation
- visual perception
- language independent
- cross lingual
- word sense disambiguation
- natural language processing
- language processing
- information extraction
- language resources
- target language
- video data
- video content
- video search
- word alignment
- multimedia
- cross language information retrieval
- natural language generation
- Brazilian Portuguese
- Chinese English
- vision system
- statistical machine translation
- machine translation system
- video sequences
- parallel corpora
- query translation
- visual information
- word level
- machine transliteration
- visual features
- translation model
- statistical translation models