Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification.
Mozhi Zhang, Yoshinari Fujinuma, Jordan L. Boyd-Graber
Published in: CoRR (2018)
Keyphrases
- document classification
- cross lingual
- text classification
- n gram
- word alignment
- text mining
- machine translation
- text documents
- text categorization
- language modeling
- cross language
- document clustering
- machine learning
- naive bayes
- similarity measure
- classification algorithm
- feature selection
- bag of words
- transfer learning
- news articles
- web documents
- labeled data
- high dimensional
- knowledge discovery
- probabilistic model
- machine translation system
- data points
- search engine