OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles.
Pierre Lison, Jörg Tiedemann
Published in: LREC (2016)
Keyphrases
- parallel corpora
- TV series
- TV broadcast
- machine translation
- cross language information retrieval
- labor intensive
- language independent
- cross lingual
- machine translation system
- comparable corpora
- closed captions
- cross language
- sentence level
- statistical machine translation
- bilingual dictionaries
- word pairs
- Wikipedia articles
- fully automated
- information retrieval
- natural language processing
- artificial intelligence