Improving Indonesian Text Classification Using Multilingual Language Model.
Ilham Firdausi Putra, Ayu Purwarianti
Published in: CoRR (2020)
Keyphrases
- language model
- text classification
- language modeling
- n gram
- cross lingual
- language independent
- language resources
- language modelling
- document retrieval
- retrieval model
- speech recognition
- probabilistic model
- information retrieval
- bag of words
- text categorization
- statistical language modeling
- test collection
- text mining
- feature selection
- translation model
- machine translation
- smoothing methods
- statistical machine translation
- cross language
- mixture model
- query expansion
- ad hoc information retrieval
- term frequency
- statistical language models
- naive bayes
- knn
- context sensitive
- digital libraries
- text classifiers
- word segmentation
- machine learning
- document length
- query terms
- language model for information retrieval
- multi label
- topic modeling
- relevance model
- vector space model
- natural language processing
- information retrieval systems