A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection.
Sahil Swami, Ankush Khandelwal, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava
Published in: CoRR (2018)
Keyphrases
- statistical machine translation
- machine translation
- person names
- link grammar
- named entities
- language identification
- comparable corpora
- detection method
- open domain
- proper names
- detection algorithm
- parallel corpus
- social media
- contextual features
- named entity recognition
- object detection
- false positives
- english language
- english words
- indian languages
- broad coverage
- spoken language
- parallel corpora
- text corpora
- wide coverage
- cross lingual
- training corpus
- cross language information retrieval
- language learning
- machine translation system
- semantic roles
- query translation
- topic tracking
- natural language processing
- multiword
- target language
- natural language