Efficient Domain Adaptation of Language Models via Adaptive Tokenization.
Vin Sachidananda, Jason S. Kessler, Yi'an Lai
Published in: CoRR (2021)
Keyphrases
- language model
- domain adaptation
- language modeling
- n-gram
- retrieval model
- document retrieval
- test collection
- information retrieval
- multiple sources
- probabilistic model
- document classification
- relevance model
- labeled data
- named entities
- cross-domain
- data mining
- semi-supervised learning
- query expansion
- information extraction
- pairwise
- training set
- target domain