Creating and Managing a large annotated parallel corpora of Indian languages.
Ritesh Kumar, Shiv Bhusan Kaushik, Pinkey Nainwani, Girish Nath Jha
Published in: CoRR (2021)
Keyphrases
- parallel corpora
- cross lingual
- Indian languages
- cross lingual information retrieval
- machine translation
- language independent
- cross language information retrieval
- comparable corpora
- cross language
- statistical machine translation
- document images
- machine translation system
- labor intensive
- language modeling
- text classification
- word segmentation
- word pairs
- language identification
- query translation
- Wikipedia articles
- bilingual dictionaries
- translation model
- sentence level