Bag of Tricks for Training Data Extraction from Language Models.
Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan
Published in: ICML (2023)
Keyphrases
- language model
- data extraction
- language modeling
- semi-structured
- web data extraction
- n-gram
- document retrieval
- probabilistic model
- speech recognition
- information retrieval
- query expansion
- data integration
- test collection
- language modelling
- information extraction
- retrieval model
- statistical language models
- bag of words
- query terms
- document ranking
- relevance model
- translation model
- smoothing methods
- web pages
- database
- machine learning
- knowledge discovery