NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.
Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung
Published in: EACL (2023)
Keyphrases
- language resources
- cross lingual
- language independent
- machine translation
- multi lingual
- sentiment classification
- multilingual information retrieval
- language specific
- multilingual documents
- sentiment analysis
- parallel implementation
- cross language information retrieval
- opinion mining
- parallel processing
- cross lingual information retrieval
- broadcast news
- query translation
- metadata
- digital libraries
- shared memory
- parallel corpora
- text classification
- indian languages
- comparable corpora
- linguistic resources
- database
- expressive power
- chinese english
- text mining
- transfer learning
- cross language
- parallel computing
- n gram