BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation.
Liyan Kang, Luyang Huang, Ningxin Peng, Peihao Zhu, Zewei Sun, Shanbo Cheng, Mingxuan Wang, Degen Huang, Jinsong Su
Published in: CoRR (2023)
Keyphrases
- machine translation
- multimedia
- target language
- language-independent
- cross-lingual
- natural language processing
- statistical machine translation
- cross language information retrieval
- video data
- machine translation system
- Chinese-English
- language processing
- information extraction
- language resources
- query translation
- word alignment
- parallel corpora
- natural language generation
- bilingual dictionaries
- video content
- natural language
- Brazilian Portuguese
- video retrieval
- word sense disambiguation
- source language
- MT evaluation
- word level
- video search
- co-occurrence
- cross-lingual information retrieval
- finite state transducers
- multilingual documents
- machine learning
- data mining