Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation.
Tahmid Hasan, Abhik Bhattacharjee, Kazi Samin, Masum Hasan, Madhusudan Basak, M. Sohel Rahman, Rifat Shahriyar
Published in: EMNLP (1) (2020)
Keyphrases
- machine translation
- statistical machine translation
- target language
- cross lingual
- word alignment
- natural language processing
- information extraction
- machine translation system
- language processing
- brazilian portuguese
- cross language information retrieval
- natural language generation
- natural language
- chinese english
- word sense disambiguation
- language independent
- language resources
- source language
- query translation
- named entities
- parallel corpora
- parallel corpus
- language specific
- machine transliteration
- phrase based smt
- word level
- information retrieval
- machine readable dictionaries
- bilingual dictionaries
- cross language