Evaluating the Role of Language Typology in Transformer-Based Multilingual Text Classification.
Sophie Groenwold, Samhita Honnavalli, Lily Ou, Aesha Parekh, Sharon Levy, Diba Mirza, William Yang Wang
Published in: CoRR (2020)
Keyphrases
- text classification
- language independent
- language specific
- n-gram
- cross-lingual
- language resources
- text categorization
- programming language
- bag of words
- feature selection
- parallel corpus
- text mining
- natural language
- text documents
- machine learning
- naive Bayes
- fuzzy logic
- language learning
- neural network
- cross-language
- semantic features
- data cleaning
- multilingual documents
- text generation
- comparable corpora
- text data
- language modeling
- bilingual dictionaries
- labeled data
- expert systems
- power system
- fault diagnosis