The Multilingual Microblog Translation Corpus: Improving and Evaluating Translation of User-Generated Text.
Paul McNamee, Kevin Duh
Published in: LREC (2022)
Keyphrases
- machine translation system
- user generated
- parallel corpus
- machine translation
- English words
- statistical machine translation
- cross language information retrieval
- Chinese-English
- social media
- comparable corpora
- query translation
- cross lingual
- training corpus
- cross language
- text content
- translation model
- parallel corpora
- website
- text data
- multiword
- text corpora
- text documents
- information retrieval systems
- social networks