Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification.
Alejandro Moreo, Andrea Pedrotti, Fabrizio Sebastiani
Published in: CoRR (2021)
Keyphrases
- cross-lingual
- text classification
- ensemble learning
- text documents
- unlabeled data
- text classifiers
- feature selection
- text categorization
- text mining
- labeled data
- generalization ability
- language modeling
- naive bayes
- document clustering
- bag of words
- base classifiers
- ensemble methods
- machine learning
- information retrieval
- random forest
- news articles
- multi-label
- n-gram
- retrieval systems
- kNN
- document collections
- dimensionality reduction
- document retrieval
- vector space
- concept drift
- information retrieval systems
- keywords
- learning algorithm
- transfer learning
- decision trees
- neural network
- unsupervised learning
- data streams
- expert systems
- prior knowledge
- probabilistic model
- knowledge discovery
- class labels
- low dimensional