Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task.
Patrick Littell, Samuel Larkin, Darlene A. Stewart, Michel Simard, Cyril Goutte, Chi-kiu Lo
Published in: WMT (shared task) (2018)
Keyphrases
- parallel corpus
- distance function
- euclidean distance
- cross lingual
- distance measure
- sentence pairs
- language independent
- cross language information retrieval
- query translation
- word alignment
- machine translation
- semantic role labeling
- machine translation system
- target language
- latent semantic analysis
- semi supervised
- supervised learning
- statistical machine translation
- document clustering
- data points
- source language
- semantic space
- knn