DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning.
Daniil Homskiy, Narek Maloyan
Published in: SemEval@ACL (2023)
Keyphrases
- fine tuning
- language model
- language modeling
- text classification
- cross lingual
- comparable corpora
- n gram
- language independent
- information retrieval
- probabilistic model
- retrieval model
- cross language
- document retrieval
- statistical language modeling
- language modelling
- query expansion
- text categorization
- feature selection
- bag of words
- speech recognition
- smoothing methods
- machine translation system
- mixture model
- ad hoc information retrieval
- natural language
- machine learning
- multi label
- linguistic resources
- test collection
- fine tuned
- context sensitive
- text mining
- query terms
- semantic features
- pseudo relevance feedback
- digital libraries
- statistical machine translation
- word clouds
- query translation
- knn
- information retrieval systems
- naive bayes
- text documents
- bilingual dictionaries
- document ranking
- cross language information retrieval
- translation model
- term frequency
- relevance model