Data Selection for Unsupervised Translation of German-Upper Sorbian.
Lukas Edman, Antonio Toral, Gertjan van Noord
Published in: WMT@EMNLP (2020)
Keyphrases
- data sets
- database
- raw data
- data collection
- high quality
- data quality
- data processing
- data sources
- knowledge discovery
- data acquisition
- high-dimensional data
- input data
- statistical methods
- data analysis
- data structure
- databases
- complex data
- statistical analysis
- small number
- semi-supervised
- data streams
- training data
- neural network