Multilingual Seq2seq Training with Similarity Loss for Cross-Lingual Document Classification.
Katherine Yu, Haoran Li, Barlas Oguz
Published in: Rep4NLP@ACL (2018)
Keyphrases
- cross-lingual
- document classification
- text classification
- cross-lingual information retrieval
- machine translation
- cross-language
- word alignment
- text categorization
- language modeling
- text mining
- web documents
- classification algorithm
- text documents
- similarity measure
- parallel corpus
- translation model
- feature selection
- bag of words
- machine learning
- naive bayes
- transfer learning
- statistical machine translation
- machine translation system
- training set
- document clustering
- probabilistic model
- kNN
- n-gram
- neural network
- data mining
- language model
- knowledge base
- natural language processing
- labeled data
- co-occurrence