SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles.
Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maucec, Andy Way, Panayota Georgakopoulou, Martin Volk
Published in: LREC (2012)
Keyphrases
- parallel corpus
- machine translation
- cross lingual
- machine translation system
- language independent
- word alignment
- cross language information retrieval
- target language
- statistical machine translation
- query translation
- natural language processing
- parallel corpora
- source language
- information extraction
- natural language
- cross lingual information retrieval
- word sense disambiguation
- text mining
- finite state transducers
- word level
- knn