Effective Parallel Corpus Mining using Bilingual Sentence Embeddings.
Mandy Guo, Qinlan Shen, Yinfei Yang, Heming Ge, Daniel Cer, Gustavo Hernández Ábrego, Keith Stevens, Noah Constant, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
Published in: WMT (2018)
Keyphrases
- parallel corpus
- cross-lingual
- sentence pairs
- cross-language information retrieval
- language-independent
- machine translation
- query translation
- machine translation system
- target language
- lexical knowledge
- word alignment
- parallel texts
- statistical machine translation
- text mining
- source language
- parallel corpora
- search engine
- dimensionality reduction
- digital libraries
- information retrieval
- co-occurrence
- language modeling
- text categorization