The Word Is Mightier Than the Count: Accumulating Translation Resources from Parsed Parallel Corpora.
Stephen Nightingale, Hideki Tanaka
Published in: CICLing (2003)
Keyphrases
- parallel corpora
- word pairs
- machine translation system
- statistical machine translation
- bilingual dictionaries
- english chinese
- machine translation
- sentence level
- out-of-vocabulary
- text corpora
- cross language information retrieval
- language independent
- comparable corpora
- sentence pairs
- parallel corpus
- cross lingual
- parallel texts
- translation model
- semantic relations
- language resources
- topic models
- query translation
- cross language
- labor intensive
- chinese english
- linguistic resources
- semantic similarity
- word alignment
- source language
- word level
- n-gram
- word sense disambiguation
- target language
- sentiment classification
- co-occurrence
- sentiment analysis
- training corpus
- tf-idf
- text mining
- information retrieval