PetroBERT: A Domain Adaptation Language Model for Oil and Gas Applications in Portuguese.
Rafael Bezerra de Menezes Rodrigues, Pedro Ivo Monteiro Privatto, Gustavo José de Sousa, Rafael P. Murari, Luis C. S. Afonso, João P. Papa, Daniel C. G. Pedronette, Ivan Rizzo Guilherme, Stephan R. Perrout, Aliel F. Riente
Published in: PROPOR (2022)
Keyphrases
- language model
- domain adaptation
- language modeling
- n-gram
- document retrieval
- probabilistic model
- information retrieval
- multiple sources
- retrieval model
- semi-supervised
- labeled data
- cross domain
- query expansion
- test collection
- document classification
- semi-supervised learning
- test data
- sentiment classification
- transfer learning
- cross language
- target domain
- training data
- similarity measure
- search engine
- co-occurrence
- supervised learning
- learning algorithm