Measuring Comparability of Documents in Non-Parallel Corpora for Efficient Extraction of (Semi-)Parallel Translation Equivalents.
Fangzhong Su, Bogdan Babych. Published in: ESIRMT/HyTra@EACL (2012)
Keyphrases
- parallel corpora
- machine translation
- cross language information retrieval
- comparable corpora
- language independent
- bilingual lexicon
- machine translation system
- labor intensive
- cross lingual
- query translation
- bilingual dictionaries
- word pairs
- parallel texts
- cross language
- information extraction
- statistical machine translation
- sentence level
- target language
- error prone
- multi document summarization
- text documents
- web search
- similarity measure