Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation.
Tahmid Hasan, Abhik Bhattacharjee, Kazi Samin, Masum Hasan, Madhusudan Basak, M. Sohel Rahman, Rifat Shahriyar
Published in: CoRR (2020)
Keyphrases
- machine translation
- statistical machine translation
- target language
- machine translation system
- word alignment
- language independent
- cross lingual
- language processing
- cross language information retrieval
- natural language processing
- Brazilian Portuguese
- natural language
- natural language generation
- Chinese-English
- word sense disambiguation
- information extraction
- language resources
- source language
- parallel corpus
- language specific
- named entity recognition
- information retrieval
- parallel corpora
- machine learning
- phrase-based SMT
- data mining