Vashantor: A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language.
Fatema Tuj Johora Faria, Mukaffi Bin Moin, Ahmed Al Wase, Mehidi Ahmmed, Md. Rabius Sani, Tashreef Muhammad
Published in: CoRR (2023)
Keyphrases
- benchmark datasets
- indian languages
- statistical machine translation
- parallel corpus
- machine translation system
- machine translation
- cross lingual information retrieval
- cross lingual
- language resources
- cross language information retrieval
- language specific
- test set
- bilingual dictionaries
- source language
- chinese english
- parallel corpora
- document images
- target language
- comparable corpora
- word alignment
- scene images
- multilingual documents
- language identification
- handwritten documents
- cross language
- translation model
- character segmentation
- outdoor images
- real world
- handwritten numerals
- language independent
- linguistic resources
- character recognition
- pedestrian detection
- pose estimation
- recognition rate
- language model
- information extraction
- face recognition
- language modeling
- multiword