PHINC: A Parallel Hinglish Social Media Code-Mixed Corpus for Machine Translation.
Vivek Srivastava, Mayank Singh
Published in: CoRR (2020)
Keyphrases
- machine translation
- social media
- statistical machine translation
- Chinese-English
- parallel corpora
- machine translation system
- parallel corpus
- cross-lingual
- natural language processing
- POS tagging
- target language
- language independent
- cross-language information retrieval
- language processing
- information extraction
- word alignment
- language resources
- natural language generation
- comparable corpora
- training corpus
- natural language
- Brazilian Portuguese
- monolingual
- word-level
- word sense
- query translation
- user-generated content
- word sense disambiguation
- sentiment classification
- co-occurrence