CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered.
Ondrej Bojar, Ondrej Dusek, Tom Kocmi, Jindrich Libovický, Michal Novák, Martin Popel, Roman Sudarikov, Dusan Varis
Published in: TSD (2016)
Keyphrases
- parallel corpus
- language independent
- cross lingual
- machine translation
- cross language
- cross language information retrieval
- machine translation system
- word alignment
- query translation
- statistical machine translation
- sentence pairs
- parallel corpora
- n-gram
- target language
- information retrieval
- language modeling
- text classification
- latent semantic analysis
- source language
- document clustering
- co-occurrence
- machine learning
- cl sr