Bag of Tricks for Training Data Extraction from Language Models.
Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan
Published in: CoRR (2023)
Keyphrases
- language model
- data extraction
- language modeling
- n-gram
- semi-structured
- web data extraction
- probabilistic model
- data integration
- language modelling
- query expansion
- speech recognition
- statistical language models
- test collection
- document retrieval
- information retrieval
- retrieval model
- bag of words
- web pages
- relevance model
- information extraction
- query terms
- pseudo-relevance feedback
- document ranking
- context-sensitive
- language models for information retrieval
- data sets