APE-then-QE: Correcting then Filtering Pseudo Parallel Corpora for MT Training Data Creation.
Akshay Batheja, Sourabh Dattatray Deoghare, Diptesh Kanojia, Pushpak Bhattacharyya
Published in: CoRR (2023)
Keyphrases
- parallel corpora
- machine translation
- training data
- query translation
- query expansion
- language independent
- cross language information retrieval
- cross lingual
- machine translation system
- information extraction
- target language
- decision trees
- statistical machine translation
- natural language
- learning algorithm
- natural language processing
- training set
- labor intensive
- cross language
- retrieval effectiveness
- Wikipedia articles
- test collection
- n-gram
- labeled data