Gender Prediction in English-Hindi Code-Mixed Social Media Content: Corpus and Baseline System.
Ankush Khandelwal, Sahil Swami, Syed Sarfaraz Akhtar, Manish Shrivastava
Published in: Computación y Sistemas (2018)
Keyphrases
- statistical machine translation
- machine translation
- social media content
- link grammar
- security informatics
- language identification
- natural language
- open domain
- person names
- comparable corpora
- proper names
- contextual features
- sentiment analysis
- internet usage
- parallel corpus
- social media
- multiword
- spoken language
- named entity recognition
- cross-lingual
- target language
- parallel corpora
- Indian languages
- optical character recognition
- Penn Treebank
- query translation
- user generated content
- machine translation system
- word sense
- data mining
- cross-language information retrieval
- computational methods
- language model
- co-occurrence
- natural language processing
- knowledge representation
- data analysis
- knowledge base
- social networks
- machine learning