DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning.
Daniil Homskiy, Narek Maloyan
Published in: CoRR (2023)
Keyphrases
- query specific
- language model
- fine tuning
- language modeling
- text classification
- cross lingual
- n gram
- comparable corpora
- language independent
- query expansion
- pseudo relevance feedback
- probabilistic model
- document retrieval
- information retrieval
- retrieval model
- mixture model
- test collection
- bag of words
- smoothing methods
- query terms
- translation model
- language modelling
- bilingual dictionaries
- text categorization
- linguistic resources
- fine tuned
- context sensitive
- speech recognition
- machine learning
- digital libraries
- cross language
- text mining
- document ranking
- naive bayes
- feature selection
- text documents
- statistical machine translation
- term frequency
- word segmentation
- machine translation system
- ad hoc information retrieval
- statistical language modeling
- out of vocabulary
- part of speech
- vector space model
- multi label