Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages.
Idris Abdulmumin, Michael Beukman, Jesujoba O. Alabi, Chris Chinenye Emezue, Everlyn Chimoto, Tosin P. Adewumi, Shamsuddeen Hassan Muhammad, Mofetoluwa Adeyemi, Oreen Yousuf, Sahib Singh, Tajuddeen Gwadabe
Published in: WMT (2022)
Keyphrases
- data sets
- data analysis
- data processing
- database
- high quality
- image data
- computer systems
- data structure
- data collection
- training data
- probability distribution
- knowledge discovery
- original data
- data quality
- highly correlated
- machine translation system
- data sources
- synthetic data
- spatial data
- expressive power
- machine translation