KC4MT: A High-Quality Corpus for Multilingual Machine Translation.
Vinh Van Nguyen, Ha Nguyen, Huong Thanh Le, Thai Phuong Nguyen, Tan Van Bui, Luan-Nghia Pham, Anh Tuan Phan, Cong Hoang-Minh Nguyen, Viet-Hong Tran, Anh Huu Tran
Published in: LREC (2022)
Keyphrases
- machine translation
- parallel corpus
- Chinese English
- cross lingual
- machine translation system
- statistical machine translation
- language resources
- cross language information retrieval
- comparable corpora
- language independent
- parallel corpora
- language specific
- mono lingual
- multilingual documents
- natural language processing
- cross lingual information retrieval
- language processing
- pos tagging
- word sense disambiguation
- target language
- information extraction
- word alignment
- natural language
- cross language
- natural language generation
- word sense
- query translation
- word level
- source language
- multilingual information retrieval
- bilingual lexicon
- Brazilian Portuguese
- text corpora
- sentiment classification
- finite state transducers
- lexical knowledge
- machine learning