Heterogeneous document embeddings for cross-lingual text classification.
Alejandro Moreo, Andrea Pedrotti, Fabrizio Sebastiani
Published in: SAC (2021)
Keyphrases
- cross lingual
- text classification
- document classification
- text documents
- text classifiers
- term frequency
- language modeling
- language independent
- text mining
- text categorization
- cross lingual information retrieval
- feature selection
- word sense
- bag of words
- machine learning
- cross language
- sentiment classification
- document clustering
- information retrieval systems
- labeled data
- text data
- information retrieval
- knn
- transfer learning
- retrieval systems
- news articles
- vector space
- document images
- document collections
- translation model
- language model
- dimensionality reduction
- probabilistic model
- text corpora
- semantic features
- query translation
- low dimensional
- query expansion
- tf idf
- document retrieval