C-BiLDA extracting cross-lingual topics from non-parallel texts by distinguishing shared from unshared content.
Geert Heyman, Ivan Vulic, Marie-Francine Moens
Published in: Data Min. Knowl. Discov. (2016)
Keyphrases
- cross lingual
- parallel corpora
- parallel corpus
- machine translation system
- machine translation
- statistical machine translation
- cross language
- language modeling
- language independent
- cross language information retrieval
- text classification
- translation model
- wikipedia articles
- sentiment classification
- bilingual dictionaries
- information retrieval
- news articles
- machine learning
- language model
- latent dirichlet allocation
- text documents
- topic modeling
- labor intensive
- knowledge base