RoBERTuito: a pre-trained language model for social media text in Spanish.
Juan Manuel Pérez, Damián Ariel Furman, Laura Alonso Alemany, Franco M. Luque
Published in: LREC (2022)
Keyphrases
- language model
- pre trained
- social media
- information retrieval
- language modeling
- n gram
- document retrieval
- probabilistic model
- query expansion
- multiword
- speech recognition
- retrieval model
- training data
- text retrieval
- training examples
- text mining
- smoothing methods
- ad hoc information retrieval
- control signals
- test collection
- context sensitive
- learning algorithm
- query terms
- mixture model
- keywords
- cross lingual
- relevance model
- visual features
- decision trees