PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents.
Ryo Fujii, Masato Mita, Kaori Abe, Kazuaki Hanawa, Makoto Morishita, Jun Suzuki, Kentaro Inui
Published in: COLING (2020)
Keyphrases
- machine translation
- user generated content
- cross lingual
- information extraction
- natural language processing
- language independent
- language processing
- cross language information retrieval
- natural language generation
- target language
- word sense disambiguation
- pairwise
- word alignment
- machine translation system
- multilingual documents
- statistical machine translation
- natural language
- machine transliteration
- statistical translation models
- parallel corpora
- recommender systems
- Chinese-English
- language resources
- grammar induction
- social media
- Brazilian Portuguese
- expert systems