Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation.
Jakob Uszkoreit, Thorsten Brants
Published in: ACL (2008)
Keyphrases
- language modeling
- machine translation
- cross lingual
- language model
- n-gram
- statistical machine translation
- translation model
- finite state transducers
- language independent
- machine translation system
- word alignment
- word sense disambiguation
- parallel corpus
- word level
- comparable corpora
- information retrieval
- target language
- retrieval model
- word segmentation
- sentence retrieval
- query expansion
- cross language
- probabilistic model
- source language
- cross language information retrieval
- information extraction
- document clustering
- statistical translation models
- natural language processing
- text classification
- co occurrence
- query translation
- finite state
- data mining
- parallel corpora
- information retrieval systems
- feature selection
- ad hoc information retrieval