IndiSentiment140: Sentiment Analysis Dataset for Indian Languages with Emphasis on Low-Resource Languages using Machine Translation.
Saurabh Kumar, Sanasam Ranbir Sanasam, Sukumar Nandi
Published in: NAACL-HLT (2024)
Keyphrases
- cross lingual
- machine translation
- indian languages
- sentiment classification
- sentiment analysis
- natural language processing
- cross lingual information retrieval
- language independent
- text classification
- target language
- cross language
- information extraction
- sentence level
- language processing
- statistical machine translation
- language identification
- language modeling
- machine translation system
- chinese english
- cross language information retrieval
- natural language
- parallel corpora
- natural language generation
- text mining
- parallel corpus
- word alignment
- word sense disambiguation
- data mining
- linguistic resources
- word level
- query translation
- wordnet
- bilingual dictionaries
- document images
- information retrieval
- machine learning
- word segmentation
- co occurrence
- clustering algorithm
- source language
- test collection
- vector space
- feature selection