PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents.
Ryo Fujii, Masato Mita, Kaori Abe, Kazuaki Hanawa, Makoto Morishita, Jun Suzuki, Kentaro Inui
Published in: CoRR (2020)
Keyphrases
- bilingual dictionaries
- machine translation
- user generated content
- cross language information retrieval
- parallel corpora
- cross lingual
- natural language processing
- query translation
- machine translation system
- information extraction
- language independent
- target language
- language processing
- natural language
- natural language generation
- statistical machine translation
- social media
- Chinese English
- pairwise
- language resources
- word alignment
- artificial intelligence
- word sense disambiguation
- source language
- expert systems
- machine learning
- tasks in natural language processing