Automatic Extraction of Low Frequency Bilingual Word Pairs from Parallel Corpora with Various Languages.
Hiroshi Echizen-ya, Kenji Araki, Yoshio Momouchi
Published in: PAKDD (2005)
Keyphrases
- automatic extraction
- parallel corpora
- low frequency
- high frequency
- language independent
- cross lingual
- comparable corpora
- machine translation
- frequency domain
- cross language information retrieval
- wavelet transform
- statistical machine translation
- bilingual dictionaries
- labor intensive
- relation extraction
- machine translation system
- cross language
- query translation
- subband
- wavelet coefficients
- sentence pairs
- word pairs
- language modeling
- sentence level
- target language
- wikipedia articles
- text classification
- multiscale
- document retrieval
- n gram
- co occurrence
- multiresolution
- high quality