Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages by Rephrasing Training Samples.
Paulo Cavalin, Pedro Henrique Domingues, Claudio S. Pinhanez, Julio Nogima
Published in: NAACL-HLT (2024)
Keyphrases
- training samples
- language resources
- language independent
- machine translation
- cross lingual
- multi lingual
- multilingual documents
- machine translation system
- comparable corpora
- feature space
- multilingual information retrieval
- training set
- language specific
- training data
- hyperplane
- number of training samples
- supervised learning
- test sample
- learning algorithm
- decision boundary
- parallel corpora
- high dimensional
- cross language information retrieval
- sample set
- face images
- query translation
- cross language
- decision function
- machine learning
- discriminative information
- text classification
- multi class
- feature vectors
- objective function
- decision trees