PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation.
Long Doan, Linh The Nguyen, Nguyen Luong Tran, Thai Hoang, Dat Quoc Nguyen
Published in: EMNLP (1) (2021)
Keyphrases
- machine translation
- benchmark datasets
- cross lingual
- target language
- language independent
- natural language processing
- Brazilian Portuguese
- cross language information retrieval
- statistical machine translation
- language processing
- machine translation system
- information extraction
- natural language
- language resources
- word sense disambiguation
- Chinese English
- parallel corpora
- natural language generation
- word level
- source language
- machine learning
- word alignment
- machine transliteration
- information retrieval
- language specific
- comparable corpora
- query translation
- English Chinese
- multilingual documents