Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings.
Silvia Severini, Viktor Hangya, Masoud Jalili Sabet, Alexander Fraser, Hinrich Schütze
Published in: CoRR (2022)
Keyphrases
- supervised learning
- supervised training
- word alignment
- unsupervised learning
- multiword
- signal processing
- training corpus
- word pairs
- parallel corpus
- n-gram
- machine translation
- bilingual dictionaries
- machine translation system
- word sense disambiguation
- manifold learning
- English-Chinese
- co-occurrence
- semi-supervised
- training set
- sentence pairs
- supervised methods
- neural network
- bilingual lexicon
- statistical machine translation
- word segmentation
- low dimensional
- dimensionality reduction
- low cost
- information retrieval
- machine learning