Robust language modeling for a small corpus of target tasks using class-combined word statistics and selective use of a general corpus.
Yosuke Wada, Norihiko Kobayashi, Tetsunori Kobayashi
Published in: Systems and Computers in Japan (2003)
Keyphrases
- language modeling
- language model
- n-gram
- multiword
- statistical machine translation
- query expansion
- translation model
- retrieval model
- cross-lingual
- parallel corpus
- comparable corpora
- term weighting
- document level
- co-occurrence
- statistical language modeling
- word pairs
- word segmentation
- improvements in retrieval effectiveness
- sentence level
- information retrieval
- language independent
- probabilistic model