Ncode: an Open Source Bilingual N-gram SMT Toolkit.
Josep Maria Crego, François Yvon, José B. Mariño
Published in: Prague Bull. Math. Linguistics (2011)
Keyphrases
- n gram
- open source
- statistical machine translation
- language model
- word alignment
- machine translation
- chinese english
- cross lingual
- language modeling
- language independent
- language modelling
- variable length
- machine translation system
- parallel corpora
- text classification
- part of speech
- parallel corpus
- bag of words
- translation model
- character n grams
- multiword
- out of vocabulary
- cross language information retrieval
- finite state transducers
- word segmentation
- viterbi algorithm
- web documents
- document retrieval
- source language
- information retrieval
- query expansion
- feature selection
- word level
- test collection
- information extraction
- data mining
- inside outside algorithm