Bilingual Text Classification using the IBM 1 Translation Model.
Jorge Civera, Alfons Juan-Císcar
Published in: LREC (2008)
Keyphrases
- text classification
- translation model
- cross lingual
- Chinese English
- language modeling
- cross language
- text categorization
- statistical machine translation
- cross language information retrieval
- language independent
- feature selection
- language model
- word alignment
- text mining
- n gram
- parallel corpus
- bag of words
- comparable corpora
- text documents
- machine learning
- bilingual dictionaries
- parallel corpora
- labeled data
- query translation
- multi label
- knn
- text classifiers
- statistical translation models
- machine translation system
- semantic features
- co occurrence
- unlabeled data