Sampling Imbalanced Data for Multilingual Machine Translation: An Overview of Techniques.
Albina Khusainova
Published in: ISDA (4) (2022)
Keyphrases
- imbalanced data
- machine translation
- cross lingual
- language independent
- language resources
- cross language information retrieval
- multilingual documents
- chinese english
- sampling methods
- machine translation system
- natural language processing
- linear regression
- class distribution
- classification models
- information extraction
- parallel corpus
- natural language
- cross language
- support vector machine
- ensemble methods
- class imbalance
- query translation
- target language
- feature selection
- information retrieval
- decision trees
- statistical machine translation
- minority class
- word alignment
- random forest
- translation model
- sampling algorithm
- ensemble classifier
- svm classifier
- markov chain