ATHAR: A High-Quality and Diverse Dataset for Classical Arabic to English Translation.
Mohammed Khalil, Mohammed Sabry
Published in: CoRR (2024)
Keyphrases
- high quality
- machine translation
- arabic language
- query translation
- language identification
- cross language information retrieval
- statistical machine translation
- cross language
- language resources
- cross language retrieval
- bilingual dictionaries
- target language
- parallel corpora
- cross language ir
- parallel corpus
- english language
- machine translation system
- real world
- low quality
- synthetic datasets
- language learning
- pronominal anaphora
- cross lingual
- source language
- comparable corpora
- chinese english
- word alignment
- cross lingual information retrieval
- english chinese
- morphological analysis
- language processing
- text to speech
- english words
- image quality
- sentence pairs
- broadcast news