Unsupervised comparable corpora preparation and exploration for bi-lingual translation equivalents.
Krzysztof Wolk, Krzysztof Marasek
Published in: IWSLT (2015)
Keyphrases
- comparable corpora
- cross-language information retrieval
- parallel corpora
- machine translation
- news articles
- bilingual lexicon
- word pairs
- text corpora
- language modeling
- bilingual dictionaries
- target language
- query translation
- semi-supervised
- cross-language
- text documents
- translation model
- unsupervised learning
- statistical machine translation
- topic modeling
- labor-intensive
- cross-lingual
- information extraction
- bi-directional
- tf-idf
- parallel corpus
- linguistic resources
- computational linguistics
- novelty detection
- kNN
- language-independent
- language model