Cross-Lingual Text Classification of Transliterated Hindi and Malayalam.
Jitin Krishnan, Antonios Anastasopoulos, Hemant Purohit, Huzefa Rangwala
Published in: Big Data (2022)
Keyphrases
- cross lingual
- text classification
- indian languages
- bilingual dictionaries
- language independent
- cross lingual information retrieval
- text categorization
- bag of words
- cross language
- language modeling
- text mining
- machine learning
- labeled data
- text documents
- multi lingual
- parallel corpora
- feature selection
- unlabeled data
- word segmentation
- translation model
- text classifiers
- data mining
- parallel corpus
- knn
- query translation
- knowledge discovery
- k nearest neighbor
- n gram
- machine translation system
- document clustering
- transfer learning