ViSoBERT: A Pre-Trained Language Model for Vietnamese Social Media Text Processing.
Quoc-Nam Nguyen, Chau-Thang Phan, Duc-Vu Nguyen, Kiet Van Nguyen
Published in: CoRR (2023)
Keyphrases
- text processing
- language model
- pre-trained
- social media
- language modeling
- training data
- information retrieval
- natural language processing
- training examples
- probabilistic model
- document retrieval
- information extraction
- n-gram
- query expansion
- speech recognition
- text mining
- retrieval model
- test collection
- control signals
- smoothing methods
- translation model
- ad hoc information retrieval
- vector space model
- mixture model
- machine learning
- query terms
- prior knowledge
- text documents
- statistical model
- image segmentation
- decision trees
- learning algorithm
- data sets