Mining Large-scale Comparable Corpora from Chinese-English News Collections.
Degen Huang, Lian Zhao, Lishuang Li, Haitao Yu
Published in: COLING (Posters) (2010)
Keyphrases
- chinese english
- comparable corpora
- cross language information retrieval
- news articles
- parallel corpora
- machine translation
- linguistic resources
- query translation
- translation model
- cross language
- bilingual dictionaries
- text documents
- wordnet
- keywords
- out of vocabulary
- text mining
- information retrieval
- social media
- document collections
- data mining
- statistical machine translation
- language independent
- text collections
- machine learning
- language modeling
- query terms
- document retrieval
- knowledge discovery
- labor intensive
- knowledge base
- word pairs
- retrieval model
- language model
- digital libraries