PHINC: A Parallel Hinglish Social Media Code-Mixed Corpus for Machine Translation.
Vivek Srivastava, Mayank Singh
Published in: W-NUT@EMNLP (2020)
Keyphrases
- machine translation
- social media
- statistical machine translation
- parallel corpus
- parallel corpora
- machine translation system
- chinese english
- cross lingual
- natural language processing
- pos tagging
- language independent
- cross language information retrieval
- natural language generation
- language processing
- information extraction
- natural language
- target language
- word sense disambiguation
- word alignment
- language resources
- query translation
- training corpus
- text retrieval
- question answering
- user generated content
- multilingual documents
- brazilian portuguese