Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang
Published in: CoRR (2024)
Keyphrases
- language model
- text generation
- training data
- natural language generation
- language modeling
- natural language
- document retrieval
- probabilistic model
- information retrieval
- learning algorithm
- speech recognition
- n gram
- training set
- classification accuracy
- decision trees
- retrieval model
- theorem prover
- language modelling
- query expansion
- query terms
- ad hoc information retrieval
- test collection
- context sensitive
- vector space model
- watermarking algorithm
- statistical language models
- watermarking scheme
- cross lingual
- language models for information retrieval
- digital watermarking
- class labels
- smoothing methods
- language model for information retrieval
- artificial intelligence
- knowledge base
- query specific
- document ranking
- translation model
- machine learning
- relevance model
- dialogue system